
Result list for search "L773:1063 6919 OR L773:9781467388511"

Search: L773:1063 6919 OR L773:9781467388511

Results 1-50 of 53

No.	Reference
1.	Bylow, Erik, et al. (author) Minimizing the maximal rank 2016 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781467388511 ; 2016-January, pp. 5887-5895 Conference paper (peer-reviewed) Abstract: In computer vision, many problems can be formulated as finding a low rank approximation of a given matrix. Ideally, if all elements of the measurement matrix are available, this is easily solved in the L2-norm using factorization. However, in practice this is rarely the case. Lately, this problem has been addressed using different approaches: one is to replace the rank term by the convex nuclear norm, another is to derive the convex envelope of the rank term plus a data term. In the latter case, matrices are divided into sub-matrices and the envelope is computed for each sub-block individually. In this paper a new convex envelope is derived which takes all sub-matrices into account simultaneously. This leads to a simpler formulation, using only one parameter to control the trade-off between rank and data fit, for applications where one seeks low rank approximations of multiple matrices with the same rank. We show in this paper how our general framework can be used for manifold denoising of several images at once, as well as just denoising one image. Experimental comparisons show that our method achieves results similar to state-of-the-art approaches while being applicable to other problems such as linear shape model estimation.
2.	Fredriksson, Johan, et al. (author) Optimal relative pose with unknown correspondences 2016 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781467388511 ; 2016-January, pp. 1728-1736 Conference paper (peer-reviewed) Abstract: Previous work on estimating the epipolar geometry of two views relies on being able to reliably match feature points based on appearance. In this paper, we go one step further and show that it is feasible to compute both the epipolar geometry and the correspondences at the same time based on geometry only. We do this in a globally optimal manner. Our approach is based on an efficient branch and bound technique in combination with bipartite matching to solve the correspondence problem. We rely on several recent works to obtain good bounding functions to battle the combinatorial explosion of possible matchings. It is experimentally demonstrated that more difficult cases can be handled and that more inlier correspondences can be obtained by being less restrictive in the matching phase.
3.	Nasihatkon, Seyed Behrooz, 1983, et al. (author) Globally optimal rigid intensity based registration: A fast fourier domain approach 2016 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781467388511 ; 2016-January, pp. 5936-5944 Conference paper (peer-reviewed) Abstract: High computational cost is the main obstacle to adapting globally optimal branch-and-bound algorithms to intensity-based registration. Existing techniques to speed up such algorithms use a multiresolution pyramid of images and bounds on the target function among different resolutions for rigidly aligning two images. In this paper, we propose a dual algorithm in which the optimization is done in the Fourier domain, and multiple resolution levels are replaced by multiple frequency bands. The algorithm starts by computing the target function in lower frequency bands and keeps adding higher frequency bands until the current subregion is either rejected or divided into smaller areas in a branch and bound manner. Unlike spatial multiresolution approaches, to compute the target function for a wider frequency area, one just needs to compute the target in the residual bands. Therefore, if an area is to be discarded, the algorithm performs only the computations required for the rejection. This property also enables us to use a rather large number of frequency bands compared to the limited number of resolution levels used in the space domain algorithm. Experimental results on real images demonstrate considerable speed gains over the space domain method in most cases.
4.	Ask, Erik, et al. (author) Optimal Geometric Fitting Under the Truncated L-2-Norm 2013 In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. ; pp. 1722-1729 Conference paper (peer-reviewed) Abstract: This paper is concerned with model fitting in the presence of noise and outliers. Previously it has been shown that the number of outliers can be minimized with polynomial complexity in the number of measurements. This paper improves on these results in two ways. First, it is shown that for a large class of problems, the statistically more desirable truncated L-2-norm can be optimized with the same complexity. Then, with the same methodology, it is shown how to transform multi-model fitting into a purely combinatorial problem with worst-case complexity that is polynomial in the number of measurements, though exponential in the number of models. We apply our framework to a series of hard registration and stitching problems, demonstrating that the approach is not only of theoretical interest. It gives a practical method for simultaneously dealing with measurement noise and large amounts of outliers for fitting problems with low-dimensional models.
5.	Balabanov, Oleksandr, et al. (author) Bayesian Posterior Approximation With Stochastic Ensembles 2023 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9798350301298 ; 2023-June, pp. 13701-13711 Conference paper (peer-reviewed) Abstract: We introduce ensembles of stochastic neural networks to approximate the Bayesian posterior, combining stochastic methods such as dropout with deep ensembles. The stochastic ensembles are formulated as families of distributions and trained to approximate the Bayesian posterior with variational inference. We implement stochastic ensembles based on Monte Carlo dropout, DropConnect and a novel non-parametric version of dropout and evaluate them on a toy problem and CIFAR image classification. For both tasks, we test the quality of the posteriors directly against Hamiltonian Monte Carlo simulations. Our results show that stochastic ensembles provide more accurate posterior estimates than other popular baselines for Bayesian inference.
6.	Berthilsson, Rikard, et al. (author) Reconstruction of 3D-Curves from 2D-Images Using Affine Shape Methods for Curves 1997 In: [Host publication title missing]. - 1063-6919. ; pp. 476-481 Conference paper (other academic/artistic) Abstract: In this paper, we propose an algorithm for reconstructing general 3D-curves from a number of 2D-images taken by uncalibrated cameras. No point correspondences between the images are assumed. The curve and the view points are uniquely reconstructed, modulo common projective transformations, and the point correspondence problem is solved. Furthermore, the algorithm is independent of the choice of coordinates, as it is based on orthogonal projections and aligning subspaces. The algorithm is based on an extension of affine shape of finite point configurations to more general objects.
7.	Berthilsson, Rikard, et al. (author) Recursive Structure and Motion from Image Sequences using Shape and Depth Spaces 1997 In: [Host publication title missing]. - 1063-6919. ; pp. 444-449 Conference paper (other academic/artistic) Abstract: In this paper a novel recursive method for estimating structure and motion from image sequences is presented. The novelty lies in the fact that the output of the algorithm is independent of the chosen coordinate systems in the images as well as the ordering of the points. It relies on subspace methods and is derived both from ordinary coordinate representations and camera matrices and from a so-called depth and shape analysis. Furthermore, no initial phase is needed to start up the algorithm. It starts directly with the first two images and incorporates new images as soon as new corresponding points are obtained. The performance of the algorithm is shown on simulated data. Moreover, the two different approaches, one using camera matrices and the other using the concepts of affine shape and depth, are unified into a general theory of structure and motion from image sequences.
8.	Broomé, Sofia, et al. (author) Dynamics are important for the recognition of equine pain in video 2019 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - : Institute of Electrical and Electronics Engineers (IEEE). - 1063-6919. Conference paper (peer-reviewed) Abstract: A prerequisite for successfully alleviating pain in animals is recognizing it, which is a great challenge in non-verbal species. Furthermore, prey animals such as horses tend to hide their pain. In this study, we propose a deep recurrent two-stream architecture for the task of distinguishing pain from non-pain in videos of horses. Different models are evaluated on a unique dataset showing horses under controlled trials with moderate pain induction, which has been presented in earlier work. Sequential models are experimentally compared to single-frame models, showing the importance of the temporal dimension of the data, and are benchmarked against a veterinary expert classification of the data. We additionally perform baseline comparisons with generalized versions of state-of-the-art human pain recognition methods. While equine pain detection in machine learning is a novel field, our results surpass veterinary expert performance and outperform pain detection results reported for other larger non-human species.
9.	Bökman, Georg, 1994, et al. (author) ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds 2022 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - : IEEE Computer Society. - 1063-6919. ; 2022-June, pp. 10966-10975 Conference paper (peer-reviewed) Abstract: In this paper, we are concerned with rotation equivariance on 2D point cloud data. We describe a particular set of functions able to approximate any continuous rotation equivariant and permutation invariant function. Based on this result, we propose a novel neural network architecture for processing 2D point clouds and we prove its universality for approximating functions exhibiting these symmetries. We also show how to extend the architecture to accept a set of 2D-2D correspondences as input, while maintaining similar equivariance properties. Experiments are presented on the estimation of essential matrices in stereo vision.
10.	Camposeco, Federico, et al. (author) Hybrid scene Compression for Visual Localization 2019 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2019-June, pp. 7645-7654 Conference paper (peer-reviewed) Abstract: Localizing an image w.r.t. a 3D scene model represents a core task for many computer vision applications. An increasing number of real-world applications of visual localization on mobile devices, e.g., Augmented Reality or autonomous robots such as drones or self-driving cars, demand localization approaches to minimize storage and bandwidth requirements. Compressing the 3D models used for localization thus becomes a practical necessity. In this work, we introduce a new hybrid compression algorithm that uses a given memory limit in a more effective way. Rather than treating all 3D points equally, it represents a small set of points with full appearance information and an additional, larger set of points with compressed information. This enables our approach to obtain a more complete scene representation without increasing the memory requirements, leading to superior performance compared to previous compression schemes. As part of our contribution, we show how to handle ambiguous matches arising from point compression during RANSAC. Besides outperforming previous compression techniques in terms of pose accuracy under the same memory constraints, our compression scheme itself is also more efficient. Furthermore, the localization rates and accuracy obtained with our approach are comparable to state-of-the-art feature-based methods, while using a small fraction of the memory.
11.	Chelani, Kunal, 1992, et al. (author) How privacy-preserving are line clouds? Recovering scene details from 3D lines 2021 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; pp. 15663-15673 Conference paper (peer-reviewed) Abstract: Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene. Visual localization algorithms are a fundamental building block in advanced computer vision applications, including Mixed and Virtual Reality systems. Many algorithms used in practice represent the scene through a Structure-from-Motion (SfM) point cloud and use 2D-3D matches between a query image and the 3D points for camera pose estimation. As recently shown, image details can be accurately recovered from SfM point clouds by translating renderings of the sparse point clouds to images. To address the resulting potential privacy risks for user-generated content, it was recently proposed to lift point clouds to line clouds by replacing 3D points with randomly oriented 3D lines passing through these points. The resulting representation is unintelligible to humans and effectively prevents point cloud-to-image translation. This paper shows that a significant amount of information about the 3D scene geometry is preserved in these line clouds, allowing us to (approximately) recover the 3D point positions and thus to (approximately) recover image content. Our approach is based on the observation that the closest points between lines can yield a good approximation to the original 3D points. Code is available at https://github.com/kunalchelani/Line2Point.
12.	Chelani, Kunal, 1992, et al. (author) Privacy-Preserving Representations are not Enough: Recovering Scene Content from Camera Poses 2023 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2023-June, pp. 13132-13141 Conference paper (peer-reviewed) Abstract: Visual localization is the task of estimating the camera pose from which a given image was taken and is central to several 3D computer vision applications. With the rapid growth in the popularity of AR/VR/MR devices and cloud-based applications, privacy issues are becoming a very important aspect of the localization process. Existing work on privacy-preserving localization aims to defend against an attacker who has access to a cloud-based service. In this paper, we show that an attacker can learn about details of a scene without any access by simply querying a localization service. The attack is based on the observation that modern visual localization algorithms are robust to variations in appearance and geometry. While this is in general a desired property, it also leads to algorithms localizing objects that are similar enough to those present in a scene. An attacker can thus query a server with a large enough set of images of objects, e.g., obtained from the Internet, and some of them will be localized. The attacker can thus learn about object placements from the camera poses returned by the service (which is the minimal information returned by such a service). In this paper, we develop a proof-of-concept version of this attack and demonstrate its practical feasibility. The attack does not place any requirements on the localization algorithm used, and thus also applies to privacy-preserving representations. Current work on privacy-preserving representations alone is thus insufficient.
13.	Danelljan, Martin, 1989-, et al. (author) A Probabilistic Framework for Color-Based Point Set Registration 2016 In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; pp. 1818-1826 Conference paper (peer-reviewed) Abstract: In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.
14.	Danelljan, Martin, 1989-, et al. (author) Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking 2016 In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; pp. 1430-1438 Conference paper (peer-reviewed) Abstract: Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
15.	Dusmanu, Mihai, et al. (author) D2-Net: A Trainable CNN for Joint Description and Detection of Local Features 2019 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2019-June, pp. 8084-8093 Conference paper (peer-reviewed) Abstract: In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: it is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
16.	Enqvist, Olof, et al. (author) A Brute-Force Algorithm for Reconstructing a Scene from Two Projections 2011 In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2011. - 1063-6919. ; pp. 2961-2968 Conference paper (peer-reviewed) Abstract: Is the real problem in finding the relative orientation of two viewpoints the correspondence problem? We argue that this is only one difficulty. Even with known correspondences, popular methods like the eight point algorithm and minimal solvers may break down due to planar scenes or small relative motions. In this paper, we derive a simple, brute-force algorithm which is both robust to outliers and has no such algorithmic degeneracies. Several cost functions are explored, including maximizing the consensus set and robust norms like truncated least-squares. Our method is based on parameter search in a four-dimensional space using a new epipolar parametrization. In principle, we do an exhaustive search of parameter space, but the computations are very simple and easily parallelizable, resulting in an efficient method. Further speedups can be obtained by restricting the domain of possible motions to, for example, planar motions or small rotations. Experimental results are given for a variety of scenarios, including scenes with a large portion of outliers. Further, we apply our algorithm to 3D motion segmentation, where we outperform the state of the art on the well-known Hopkins-155 benchmark database.
17.	Eriksson, Anders, 1972, et al. (author) Rotation Averaging and Strong Duality 2018 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781538664209 ; pp. 127-135 Conference paper (peer-reviewed) Abstract: In this paper we explore the role of duality principles within the problem of rotation averaging, a fundamental task in a wide range of computer vision applications. In its conventional form, rotation averaging is stated as a minimization over multiple rotation constraints. As these constraints are non-convex, this problem is generally considered challenging to solve globally. We show how to circumvent this difficulty through the use of Lagrangian duality. While such an approach is well known, it is normally not guaranteed to provide a tight relaxation. Based on spectral graph theory, we analytically prove that in many cases there is no duality gap unless the noise levels are severe. This allows us to obtain certifiably global solutions to a class of important non-convex problems in polynomial time. We also propose an efficient, scalable algorithm that outperforms general purpose numerical solvers and is able to handle the large problem instances commonly occurring in structure from motion settings. The potential of this proposed method is demonstrated on a number of different problems, consisting of both synthetic and real-world data.
18.	Fieraru, Mihai, et al. (author) Three-dimensional reconstruction of human interactions 2020 In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. - 9781728171685 ; pp. 7212-7221 Conference paper (peer-reviewed) Abstract: Understanding 3d human interactions is fundamental for fine grained scene analysis and behavioural modeling. However, most of the existing models focus on analyzing a single person in isolation, and those that process several people focus largely on resolving multi-person data association, rather than inferring interactions. This may lead to incorrect, lifeless 3d estimates that miss the subtle human contact aspects (the essence of the event) and are of little use for detailed behavioral understanding. This paper addresses such issues and makes several contributions: (1) we introduce models for interaction signature estimation (ISP) encompassing contact detection, segmentation, and 3d contact signature prediction; (2) we show how such components can be leveraged in order to produce augmented losses that ensure contact consistency during 3d reconstruction; (3) we construct several large datasets for learning and evaluating 3d contact prediction and reconstruction methods; specifically, we introduce CHI3D, a lab-based accurate 3d motion capture dataset with 631 sequences containing 2,525 contact events and 728,664 ground truth 3d poses, as well as FlickrCI3D, a dataset of 11,216 images, with 14,081 processed pairs of people, and 81,233 facet-level surface correspondences within 138,213 selected contact regions. Finally, (4) we present models and baselines to illustrate how contact estimation supports meaningful 3d reconstruction where essential interactions are captured. Models and data are made available for research purposes at http://vision.imar.ro/ci3d.
19.	Fredriksson, Johan, et al. (author) Fast and Reliable Two-View Translation Estimation 2014 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781479951178 ; pp. 1606-1612 Conference paper (peer-reviewed) Abstract: It has long been recognized that one of the fundamental difficulties in the estimation of two-view epipolar geometry is the capability of handling outliers. In this paper, we develop a fast and tractable algorithm that maximizes the number of inliers under the assumption of a purely translating camera. Compared to classical random sampling methods, our approach is guaranteed to compute the optimal solution of a cost function based on reprojection errors and it has better time complexity. The performance is in fact independent of the inlier/outlier ratio of the data. This opens the way to a more reliable approach to robust ego-motion estimation. Our basic translation estimator can be embedded into a system that computes the full camera rotation. We demonstrate the applicability in several difficult settings with large amounts of outliers. It turns out to be particularly well-suited for small rotations and rotations around a known axis (which is the case for cellular phones where the gravitation axis can be measured). Experimental results show that compared to standard RANSAC methods based on minimal solvers, our algorithm produces more accurate estimates in the presence of large outlier ratios.
20.	Iglesias, José Pedro Lopes, 1994, et al. (author) expOSE: Accurate Initialization-Free Projective Factorization using Exponential Regularization 2023 In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. - 9798350301298 ; pp. 8959-8968 Conference paper (peer-reviewed) Abstract: Bundle adjustment is a key component in practically all available Structure from Motion systems. While it is crucial for achieving accurate reconstruction, convergence to the right solution hinges on good initialization. The recently introduced factorization-based pOSE methods formulate a surrogate for the bundle adjustment error without reliance on good initialization. In this paper, we show that pOSE has an undesirable penalization of large depths. To address this we propose expOSE, which has an exponential regularization that is negligible for positive depths. To achieve efficient inference we use a quadratic approximation that allows an iterative solution with VarPro. Furthermore, we extend the method with radial distortion robustness by decomposing the Object Space Error into radial and tangential components. Experimental results confirm that the proposed method is robust to initialization and improves reconstruction quality compared to state-of-the-art methods even without bundle adjustment refinement.
21.	Iglesias, José Pedro Lopes, 1994, et al. (author) Global Optimality for Point Set Registration Using Semidefinite Programming 2020 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2020, pp. 8284-8292 Conference paper (peer-reviewed) Abstract: In this paper we present a study of global optimality conditions for Point Set Registration (PSR) with missing data. PSR is the problem of aligning multiple point clouds with an unknown target point cloud. Since non-linear rotation constraints are present, the problem is inherently non-convex and typically relaxed by computing the Lagrange dual, which is a Semidefinite Program (SDP). In this work we show that, given a local minimizer, the dual variables of the SDP can be computed in closed form. This opens up the possibility of verifying the optimality using the SDP formulation without explicitly solving it. In addition, it allows us to study under what conditions the relaxation is tight, through spectral analysis. We show that if the errors in the (unknown) optimal solution are bounded, the SDP formulation will be able to recover it.
22.	Ionescu, Catalin, et al. (author) Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation 2014 In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. ; pp. 1661-1668 Conference paper (peer-reviewed) Abstract: Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery. The availability of depth information plays a critical role, so an important question is whether a similar representation can be developed with sufficient robustness in order to estimate 3D pose from RGB images. This paper provides evidence for a positive answer, by leveraging (a) 2D human body part labeling in images, (b) second-order label-sensitive pooling over dynamically computed regions resulting from a hierarchical decomposition of the body, and (c) iterative structured-output modeling to contextualize the process based on 3D pose estimates. For robustness and generalization, we take advantage of a recent large-scale 3D human motion capture dataset, Human3.6M [18], that also has human body part labeling annotations available with images. We provide extensive experimental studies where alternative intermediate representations are compared and report a substantial 33% error reduction over competitive discriminative baselines that regress 3D human pose against global HOG features.
23.	Josephson, Klas, et al. (author) Image-based localization using hybrid feature correspondences 2007 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; pp. 2732-2739 Conference paper (peer-reviewed) Abstract: Where am I and what am I seeing? This is a classical vision problem and this paper presents a solution based on efficient use of a combination of 2D and 3D features. Given a model of a scene, the objective is to find the relative camera location of a new input image. Unlike traditional hypothesize-and-test methods that try to estimate the unknown camera position based on 3D model features only, or alternatively, based on 2D model features only, we show that using a mixture of such features, that is, a hybrid correspondence set, may improve performance. We use minimal cases of structure-from-motion for hypothesis generation in a RANSAC engine. For this purpose, several new and useful minimal cases are derived for calibrated, semi-calibrated and uncalibrated settings. Based on algebraic geometry methods, we show how these minimal hybrid cases can be solved efficiently. The whole approach has been validated on both synthetic and real data, and we demonstrate improvements compared to previous work. © 2007 IEEE.
24.	Kuang, Yubin, et al. (författare) Minimal solvers for relative pose with a single unknown radial distortion 2014 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781479951178 ; , s. 33-40 Konferensbidrag (refereegranskat)abstract In this paper, we study the problems of estimating relative pose between two cameras in the presence of radial distortion. Specifically, we consider minimal problems where one of the cameras has no or known radial distortion. There are three useful cases for this setup with a single unknown distortion: (i) fundamental matrix estimation where the two cameras are uncalibrated, (ii) essential matrix estimation for a partially calibrated camera pair, (iii) essential matrix estimation for one calibrated camera and one camera with unknown focal length. We study the parameterization of these three problems and derive fast polynomial solvers based on Gröbner basis methods. We demonstrate the numerical stability of the solvers on synthetic data. The minimal solvers have also been applied to real imagery with convincing results.
25.	Kuang, Yubin, et al. (författare) Partial Symmetry in Polynomial Systems and Its Application in Computer Vision 2014 Ingår i: [Host publication title missing]. - 1063-6919. ; , s. 438-445 Konferensbidrag (refereegranskat)abstract Algorithms for solving systems of polynomial equations are key components for solving geometry problems in computer vision. Fast and stable polynomial solvers are essential for numerous applications, e.g. minimal problems or finding all stationary points of certain algebraic errors. Recently, full symmetry in the polynomial systems has been utilized to simplify and speed up state-of-the-art polynomial solvers based on the Gröbner basis method. In this paper, we further explore partial symmetry (i.e. where the symmetry lies in a subset of the variables) in the polynomial systems. We develop novel numerical schemes to utilize such partial symmetry. We then demonstrate the advantage of our schemes in several computer vision problems. In both synthetic and real experiments, we show that utilizing partial symmetry allows us to obtain faster and more accurate polynomial solvers than the general solvers.
26.	Larsson, Måns, 1989, et al. (författare) A cross-season correspondence dataset for robust semantic segmentation 2019 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2019-June, s. 9524-9534 Konferensbidrag (refereegranskat)abstract In this paper, we present a method to utilize 2D-2D point matches between images taken during different image conditions to train a convolutional neural network for semantic segmentation. Enforcing label consistency across the matches makes the final segmentation algorithm robust to seasonal changes. We describe how these 2D-2D matches can be generated with little human interaction by geometrically matching points from 3D models built from images. Two cross-season correspondence datasets are created providing 2D-2D matches across seasonal changes as well as from day to night. The datasets are made publicly available to facilitate further research. We show that adding the correspondences as extra supervision during training improves the segmentation performance of the convolutional neural network, making it more robust to seasonal changes and weather conditions.
27.	Larsson, Viktor, et al. (författare) Compact Matrix Factorization with Dependent Subspaces 2017 Ingår i: 30th IEEE Conference on Computer Vision and Pattern Recognition. - : IEEE. - 1063-6919. - 9781538604571 ; 2017-January, s. 4361-4370 Konferensbidrag (refereegranskat)abstract Traditional matrix factorization methods approximate high dimensional data with a low dimensional subspace. This imposes constraints on the matrix elements which allow for estimation of missing entries. A lower rank provides stronger constraints and makes estimation of the missing entries less ambiguous at the cost of measurement fit. In this paper we propose a new factorization model that further constrains the matrix entries. Our approach can be seen as a unification of traditional low-rank matrix factorization and the more recent union-of-subspace approach. It adaptively finds clusters that can be modeled with low dimensional local subspaces and simultaneously uses a global rank constraint to capture the overall scene interactions. For inference we use an energy that penalizes a trade-off between data fit and degrees-of-freedom of the resulting factorization. We show qualitatively and quantitatively that regularizing both local and global dynamics yields significantly improved missing data estimation.
28.	Le, Huu, 1988, et al. (författare) A Graduated Filter Method for Large Scale Robust Estimation 2020 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 5558-5567 Konferensbidrag (refereegranskat)abstract Due to the highly non-convex nature of large-scale robust parameter estimation, avoiding poor local minima is challenging in real-world applications where input data is contaminated by a large or unknown fraction of outliers. In this paper, we introduce a novel solver for robust estimation that possesses a strong ability to escape poor local minima. Our algorithm is built upon the class of traditional graduated optimization techniques, which are considered state-of-the-art local methods to solve problems having many poor minima. The novelty of our work lies in the introduction of an adaptive kernel (or residual) scaling scheme, which allows us to achieve faster convergence rates. Like other existing methods that aim to return good local minima for robust estimation tasks, our method relaxes the original robust problem but adapts a filter framework from non-linear constrained optimization to automatically choose the level of relaxation. Experimental results on real large-scale datasets such as bundle adjustment instances demonstrate that our proposed method achieves competitive results.
29.	Le, Huu, 1988, et al. (författare) AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks 2022 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2022-June, s. 460-469 Konferensbidrag (refereegranskat)abstract We propose a new algorithm for training deep neural networks (DNNs) with binary weights. In particular, we first cast the problem of training binary neural networks (BiNNs) as a bilevel optimization instance and subsequently construct flexible relaxations of this bilevel program. The resulting training method shares its algorithmic simplicity with several existing approaches to train BiNNs, in particular with the straight-through gradient estimator successfully employed in BinaryConnect and subsequent methods. In fact, our proposed method can be interpreted as an adaptive variant of the original straight-through estimator that conditionally (but not always) acts like a linear mapping in the backward pass of error propagation. Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
30.	Liu, Xixi, 1995, et al. (författare) GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection 2023 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2023-June, s. 23946-23955 Konferensbidrag (refereegranskat)abstract Out-of-distribution (OOD) detection has been extensively studied in order to successfully deploy neural networks, in particular, for safety-critical applications. Moreover, performing OOD detection on large-scale datasets is closer to reality, but is also more challenging. Several approaches need to either access the training data for score design or expose models to outliers during training. Some post-hoc methods are able to avoid the aforementioned constraints, but are less competitive. In this work, we propose Generalized ENtropy score (GEN), a simple but effective entropy-based score function, which can be applied to any pre-trained softmax-based classifier. Its performance is demonstrated on the large-scale ImageNet-1k OOD detection benchmark. It consistently improves the average AUROC across six commonly-used CNN-based and visual transformer classifiers over a number of state-of-the-art post-hoc methods. The average AUROC improvement is at least 3.5%. Furthermore, we used GEN on top of feature-based enhancing methods as well as methods using training statistics to further improve the OOD detection performance. The code is available at: https://github.com/XixiLiu95/GEN.
31.	Mathe, Stefan, et al. (författare) Reinforcement learning for visual object detection 2016 Ingår i: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016. - 9781467388511 ; 2016-January, s. 2894-2902 Konferensbidrag (refereegranskat)abstract One of the most widely used strategies for visual object detection is based on exhaustive spatial hypothesis search. While methods like sliding windows have been successful and effective for many years, they are still brute-force, independent of the image content and the visual category being searched. In this paper we present principled sequential models that accumulate evidence collected at a small set of image locations in order to detect visual objects effectively. By formulating sequential search as reinforcement learning of the search policy (including the stopping condition), our fully trainable model can explicitly balance, for each class specifically, the conflicting goals of exploration (sampling more image regions for better accuracy) and exploitation (stopping the search efficiently when sufficiently confident about the target's location). The methodology is general and applicable to any detector response function. We report encouraging results on the PASCAL VOC 2012 object detection test set, showing that the proposed methodology achieves almost two orders of magnitude speed-up over sliding window methods.
32.	Miraldo, Pedro, et al. (författare) A Unified Model for Line Projections in Catadioptric Cameras With Rotationally Symmetric Mirrors 2022 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2022-June, s. 15776-15785 Konferensbidrag (refereegranskat)abstract Lines are among the most used computer vision features, in applications ranging from camera calibration to object detection. Catadioptric cameras with rotationally symmetric mirrors are omnidirectional imaging devices, capturing up to a 360-degree field of view. These are used in many applications ranging from robotics to panoramic vision. Although known for some specific configurations, the modeling of line projection was never fully solved for general central and non-central catadioptric cameras. We start by taking some general point reflection assumptions and derive a line reflection constraint. This constraint is then used to define a line projection into the image. Next, we compare our model with previous methods, showing that our general approach outputs the same polynomial degrees as previous configuration-specific systems. We run several experiments using synthetic and real-world data, validating our line projection model. Lastly, we show an application of our methods to an absolute camera pose problem.
33.	Olsson, Carl, et al. (författare) A polynomial-time bound for matching and registration with outliers 2008 Ingår i: 2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12. - 1063-6919. ; , s. 3230-3237 Konferensbidrag (refereegranskat)abstract We present a framework for computing optimal transformations, aligning one point set to another, in the presence of outliers. Example applications include shape matching and registration (using, for example, similarity, affine or projective transformations) as well as multiview reconstruction problems (triangulation, camera pose etc.). While standard methods like RANSAC essentially use heuristics to cope with outliers, we seek to find the largest possible subset of consistent correspondences and the globally optimal transformation aligning the point sets. Based on theory from computational geometry, we show that this is indeed possible to accomplish in polynomial time. We develop several algorithms which make efficient use of convex programming. The scheme has been tested and evaluated on both synthetic and real data for several applications.
34.	Olsson, Carl, 1978, et al. (författare) A quasiconvex formulation for radial cameras 2021 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 14571-14580 Konferensbidrag (refereegranskat)abstract In this paper we study structure from motion problems for 1D radial cameras. Under this model the projection of a 3D point is a line in the image plane going through the principal point, which makes the model invariant to radial distortion and changes in focal length. It can therefore effectively be applied to uncalibrated image collections without the need for explicit estimation of camera intrinsics. We show that the reprojection errors of 1D radial cameras are examples of quasiconvex functions. This opens up the possibility to solve a general class of relevant reconstruction problems globally optimally using tools from convex optimization. In fact, our resulting algorithm is based on solving a series of LP problems. We perform an extensive experimental evaluation, on both synthetic and real data, showing that a whole class of multiview geometry problems across a range of different camera models with varying and unknown intrinsic calibration can be reliably and accurately solved within the same framework.
35.	Olsson, Carl, et al. (författare) Curvature-Based Regularization for Surface Approximation 2012 Ingår i: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. ; , s. 1576-1583 Konferensbidrag (refereegranskat)abstract We propose an energy-based framework for approximating surfaces from a cloud of point measurements corrupted by noise and outliers. Our energy assigns a tangent plane to each (noisy) data point by minimizing the squared distances to the points and the irregularity of the surface implicitly defined by the tangent planes. In order to avoid the well-known "shrinking" bias associated with first-order surface regularization, we choose a robust smoothing term that approximates curvature of the underlying surface. In contrast to a number of recent publications estimating curvature using discrete (e.g. binary) labellings with triple-cliques, we use higher-dimensional labels that allow modeling curvature with only pair-wise interactions. Hence, many standard optimization algorithms (e.g. message passing, graph cut, etc.) can minimize the proposed curvature-based regularization functional. The accuracy of our approach for representing curvature is demonstrated by theoretical and empirical results on synthetic and real data sets from multi-view reconstruction and stereo.
36.	Olsson, Carl, et al. (författare) In Defense of 3D-Label Stereo 2013 Ingår i: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. - 1063-6919 .- 2163-6648. ; , s. 1730-1737 Konferensbidrag (refereegranskat)abstract It is commonly believed that higher order smoothness should be modeled using higher order interactions. For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e.g. tangent planes. This general paradigm has been criticized due to perceived computational complexity of optimization in higher-dimensional label space. Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. Our theoretical and experimental results demonstrate advantages over state-of-the-art methods for 2nd order smoothness stereo.
37.	Olsson, Carl, et al. (författare) Outlier Removal Using Duality 2010 Ingår i: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - 1063-6919. - 9781424469840 ; , s. 1450-1457 Konferensbidrag (refereegranskat)abstract In this paper we consider the problem of outlier removal for large scale multiview reconstruction problems. An efficient and very popular method for this task is RANSAC. However, as RANSAC only works on a subset of the images, mismatches in longer point tracks may go undetected. To deal with this problem we would like to have, as a post processing step to RANSAC, a method that works on the entire (or a larger) part of the sequence. In this paper we consider two algorithms for doing this. The first one is related to a method by Sim & Hartley where a quasiconvex problem is solved repeatedly and the residuals with the largest error are removed. Instead of solving a quasiconvex problem in each step we show that it is enough to solve a single LP or SOCP, which yields a significant speedup. Using duality we show that the same theoretical result holds for our method. The second algorithm is a faster version of the first, and it is related to the popular method of $L_1$-optimization. While it is faster and works very well in practice, there is no theoretical guarantee of success. We show that these two methods are related through duality, and evaluate the methods on a number of data sets with promising results.
38.	Olsson, Carl, et al. (författare) Projective Least-Squares: Global Solutions with Local Optimization 2009 Ingår i: CVPR: 2009 IEEE Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 1216-1223 Konferensbidrag (refereegranskat)abstract Recent work in multiple view geometry has focused on obtaining globally optimal solutions at the price of computational time efficiency. On the other hand, traditional bundle adjustment algorithms have been found to provide good solutions even though there may be multiple local minima. In this paper we justify this observation by giving a simple sufficient condition for global optimality that can be used to verify that a solution obtained from any local method is indeed global. The method is tested on numerous problem instances of both synthetic and real data sets. In the vast majority of cases we are able to verify that the solutions are optimal, in particular for small-scale problems. We also develop a branch and bound procedure that goes beyond verification. In cases where the sufficient condition does not hold, the algorithm returns either of the following two results: (i) a certificate of global optimality for the local solution or (ii) the global solution.
39.	Olsson, Carl, et al. (författare) Solving large scale binary quadratic problems: Spectral methods vs. Semidefinite programming 2007 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 1776-1783 Konferensbidrag (refereegranskat)abstract In this paper we introduce two new methods for solving binary quadratic problems. While spectral relaxation methods have been the workhorse subroutine for a wide variety of computer vision problems (segmentation, clustering, image restoration, to name a few), they have recently been challenged by semidefinite programming (SDP) relaxations. In fact, it can be shown that SDP relaxations produce better lower bounds than spectral relaxations on binary problems with a quadratic objective function. On the other hand, the computational complexity for SDP increases rapidly as the number of decision variables grows, making them inapplicable to large scale problems. Our methods combine the merits of both spectral and SDP relaxations: better (lower) bounds than traditional spectral methods and considerably faster execution times than SDP. The first method is based on spectral subgradients and can be applied to large scale SDPs with binary decision variables and the second one is based on the trust region problem. Both algorithms have been applied to several large scale vision problems with good performance. © 2007 IEEE.
40.	Pautrat, Remi, et al. (författare) DeepLSD : Line Segment Detection and Refinement with Deep Image Gradients 2023 Ingår i: Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023. - 1063-6919. - 9798350301298 ; 2023-June, s. 17327-17336 Konferensbidrag (refereegranskat)abstract Line segments are ubiquitous in our human-made world and are increasingly used in vision tasks. They are complementary to feature points thanks to their spatial extent and the structural information they provide. Traditional line detectors based on the image gradient are extremely fast and accurate, but lack robustness in noisy images and challenging conditions. Their learned counterparts are more repeatable and can handle challenging images, but at the cost of a lower accuracy and a bias towards wireframe lines. We propose to combine traditional and learned approaches to get the best of both worlds: an accurate and robust line detector that can be trained in the wild without ground truth lines. Our new line segment detector, DeepLSD, processes images with a deep network to generate a line attraction field, before converting it to a surrogate image gradient magnitude and angle, which is then fed to any existing handcrafted line detector. Additionally, we propose a new optimization tool to refine line segments based on the attraction field and vanishing points. This refinement improves the accuracy of current deep detectors by a large margin. We demonstrate the performance of our method on low-level line detection metrics, as well as on several downstream tasks using multiple challenging datasets. The source code and models are available at https://github.com/cvg/DeepLSD.
41.	Quach, Kha Gia, et al. (författare) Dyglip: A dynamic graph model with link prediction for accurate multi-camera multiple object tracking 2021 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 13779-13788 Konferensbidrag (refereegranskat)abstract Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer vision problem due to its emerging applicability in several real-world applications. Despite a large number of existing works, solving the data association problem in any MC-MOT pipeline is arguably one of the most challenging tasks. Developing a robust MC-MOT system, however, is still highly challenging due to many practical issues such as inconsistent lighting conditions, varying object movement patterns, or the trajectory occlusions of the objects between the cameras. To address these problems, this work, therefore, proposes a new Dynamic Graph Model with Link Prediction (DyGLIP) approach to solve the data association task. Compared to existing methods, our new model offers several advantages, including better feature representations and the ability to recover from lost tracks during camera transitions. Moreover, our model works gracefully regardless of the overlapping ratios between the cameras. Experimental results show that we outperform existing MC-MOT algorithms by a large margin on several practical datasets. Notably, our model works favorably on online settings but can be extended to an incremental approach for large-scale datasets.
42.	Sarlin, Paul-Edouard, et al. (författare) Back to the Feature: Learning Robust Camera Localization from Pixels to Pose 2021 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 3246-3256 Konferensbidrag (refereegranskat)abstract Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms. Many regress precise geometric quantities, like poses or 3D points, from an input image. This either fails to generalize to new viewpoints or ties the model parameters to a specific scene. In this paper, we go Back to the Feature: we argue that deep networks should focus on learning robust and invariant visual features, while the geometric estimation should be left to principled algorithms. We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. Our approach is based on the direct alignment of multiscale deep features, casting camera localization as metric learning. PixLoc learns strong data priors by end-to-end training from pixels to pose and exhibits exceptional generalization to new scenes by separating model parameters and scene geometry. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching by jointly refining keypoints and poses with little overhead. The code will be publicly available at github.com/cvg/pixloc.
43.	Sattler, Torsten, et al. (författare) Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions 2018 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781538664209 ; , s. 8601-8610 Konferensbidrag (refereegranskat)abstract Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on visual localization. Using carefully created ground truth poses for query images taken under a wide variety of conditions, we evaluate the impact of various factors on 6DOF camera pose estimation accuracy through extensive experiments with state-of-the-art localization approaches. Based on our results, we draw conclusions about the difficulty of different conditions, showing that long-term localization is far from solved, and propose promising avenues for future work, including sequence-based localization approaches and the need for better local features. Our benchmark is available at visuallocalization.net.
44.	Sattler, Torsten, 1983, et al. (författare) Understanding the Limitations of CNN-based Absolute Camera Pose Regression 2019 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2019-June, s. 3297-3307 Konferensbidrag (refereegranskat)abstract Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods.
45.	Schops, Thomas, et al. (författare) Bad slam: Bundle adjusted direct RGB-D slam 2019 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; 2019-June, s. 134-144 Konferensbidrag (refereegranskat)abstract A key component of Simultaneous Localization and Mapping (SLAM) systems is the joint optimization of the estimated 3D map and camera trajectory. Bundle adjustment (BA) is the gold standard for this. Due to the large number of variables in dense RGB-D SLAM, previous work has focused on approximating BA. In contrast, in this paper we present a novel, fast direct BA formulation which we implement in a real-time dense RGB-D SLAM algorithm. In addition, we show that direct RGB-D SLAM systems are highly sensitive to rolling shutter, RGB and depth sensor synchronization, and calibration errors. In order to facilitate state-of-the-art research on direct RGB-D SLAM, we propose a novel, well-calibrated benchmark for this task that uses synchronized global shutter RGB and depth cameras. It includes a training set, a test set without public ground truth, and an online evaluation service. We observe that the ranking of methods changes on this dataset compared to existing ones, and our proposed algorithm outperforms all other evaluated SLAM methods. Our benchmark and our open source SLAM algorithm are available at: www.eth3d.net.
46.	Schops, Thomas, et al. (författare) Why Having 10,000 Parameters in Your Camera Model Is Better Than Twelve 2020 Ingår i: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919. ; , s. 2532-2541 Konferensbidrag (refereegranskat)abstract Camera calibration is an essential first step in setting up 3D Computer Vision systems. Commonly used parametric camera models are limited to a few degrees of freedom and thus often do not optimally fit to complex real lens distortion. In contrast, generic camera models allow for very accurate calibration due to their flexibility. Despite this, they have seen little use in practice. In this paper, we argue that this should change. We propose a calibration pipeline for generic models that is fully automated, easy to use, and can act as a drop-in replacement for parametric calibration, with a focus on accuracy. We compare our results to parametric calibrations. Considering stereo depth estimation and camera pose estimation as examples, we show that the calibration error acts as a bias on the results. We thus argue that in contrast to current common practice, generic models should be preferred over parametric ones whenever possible. To facilitate this, we released our calibration pipeline at https://github.com/puzzlepaint/camera_calibration, making both easy-to-use and accurate camera calibration available to everyone.
47.	Stewenius, Henrik, et al. (författare) A minimal solution for relative pose with unknown focal length 2005 Ingår i: 2005 IEEE, Conference on Computer Vision and Pattern Recognition, Proceedings. - 1063-6919. ; , s. 789-794 Konferensbidrag (refereegranskat)abstract Assume that we have two perspective images with known intrinsic parameters except for an unknown common focal length. It is a minimally constrained problem to find the relative orientation between the two images given six corresponding points. We present an efficient solution to the problem and show that there are 15 solutions in general (including complex solutions). To the best of our knowledge this was a previously unsolved problem. The solutions are found through eigen-decomposition of a 15 x 15 matrix. The matrix itself is generated in closed form. We demonstrate through practical experiments that the algorithm is correct and numerically stable.
48.	Strandmark, Petter, et al. (author) Parallel and Distributed Graph Cuts by Dual Decomposition 2010 In: IEEE Conference on Computer Vision and Pattern Recognition. - 1063-6919. - 9781424469840 ; pp. 2085-2092 Conference paper (peer-reviewed) Abstract: Graph cuts methods are at the core of many state-of-the-art algorithms in computer vision due to their efficiency in computing globally optimal solutions. In this paper, we solve the maximum flow/minimum cut problem in parallel by splitting the graph into multiple parts and hence, further increase the computational efficacy of graph cuts. Optimality of the solution is guaranteed by dual decomposition, or more specifically, the solutions to the subproblems are constrained to be equal on the overlap with dual variables. We demonstrate that our approach both allows (i) faster processing on multi-core computers and (ii) the capability to handle larger problems by splitting the graph across multiple computers on a distributed network. Even though our approach does not give a theoretical guarantee of speed-up, an extensive empirical evaluation on several applications with many different data sets consistently shows good performance. An open source C++ implementation of the dual decomposition method is also made publicly available.
49.	Svärm, Linus, et al. (author) Accurate Localization and Pose Estimation for Large 3D Models 2014 In: Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. - 1063-6919. - 9781479951178 ; pp. 532-539 Conference paper (peer-reviewed) Abstract: We consider the problem of localizing a novel image in a large 3D model. In principle, this is just an instance of camera pose estimation, but the scale introduces some challenging problems. For one, it makes the correspondence problem very difficult and it is likely that there will be a significant rate of outliers to handle. In this paper we use recent theoretical as well as technical advances to tackle these problems. Many modern cameras and phones have gravitational sensors that allow us to reduce the search space. Further, there are new techniques to efficiently and reliably deal with extreme rates of outliers. We extend these methods to camera pose estimation by using accurate approximations and fast polynomial solvers. Experimental results are given demonstrating that it is possible to reliably estimate the camera pose despite more than 99% of outlier correspondences.
50.	Truong, Giang, et al. (author) Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach 2021 In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. - 1063-6919 ; pp. 10343-10352 Conference paper (peer-reviewed) Abstract: Robust model fitting is a core algorithm in a large number of computer vision applications. Solving this problem efficiently for datasets highly contaminated with outliers is, however, still challenging due to the underlying computational complexity. Recent literature has focused on learning-based algorithms. However, most approaches are supervised (which require a large amount of labelled training data). In this paper, we introduce a novel unsupervised learning framework that learns to directly solve robust model fitting. Unlike other methods, our work is agnostic to the underlying input features, and can be easily generalized to a wide variety of LP-type problems with quasi-convex residuals. We empirically show that our method outperforms existing unsupervised learning approaches, and achieves competitive results compared to traditional methods on several important computer vision problems.