SwePub

Search: WFRF:(Danelljan Martin)

  • Result 1-25 of 55
1.
  • Kristan, Matej, et al. (author)
  • The Sixth Visual Object Tracking VOT2018 Challenge Results
  • 2019
  • In: Computer Vision – ECCV 2018 Workshops. - Cham : Springer Publishing Company. - 9783030110086 - 9783030110093 ; , s. 3-53
  • Conference paper (peer-reviewed)abstract
    • The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis, and a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking subchallenge has been introduced to the set of standard VOT sub-challenges. The new subchallenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking subchallenges. Performance of the tested trackers typically far exceeds that of standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).
  •  
2.
  • Felsberg, Michael, 1974-, et al. (author)
  • The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results
  • 2016
  • In: Computer Vision – ECCV 2016 Workshops. - Cham : Springer International Publishing AG. - 9783319488813 - 9783319488806 ; , s. 824-849
  • Conference paper (peer-reviewed)abstract
    • The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge; the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, the VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation introduced in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensates for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.
  •  
3.
  • Kristan, Matej, et al. (author)
  • The Visual Object Tracking VOT2016 Challenge Results
  • 2016
  • In: Computer Vision – ECCV 2016 Workshops, Part II. - Cham : Springer International Publishing AG. - 9783319488813 - 9783319488806 ; , s. 777-823
  • Conference paper (peer-reviewed)abstract
    • The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of the trackers having been published at major computer vision conferences and journals in recent years. The number of tested state-of-the-art trackers makes VOT2016 the largest and most challenging benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the Appendix. VOT2016 goes beyond its predecessors by (i) introducing a new semi-automatic ground truth bounding box annotation methodology and (ii) extending the evaluation system with the no-reset experiment.
  •  
4.
  • Kristan, Matej, et al. (author)
  • The Visual Object Tracking VOT2017 challenge results
  • 2017
  • In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW 2017). - : IEEE. - 9781538610343 ; , s. 1949-1972
  • Conference paper (peer-reviewed)abstract
    • The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies, and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically far exceeds that of standard baselines. The source code for most of the trackers is publicly available from the VOT page. VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a real-time tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.
  •  
5.
  • Felsberg, Michael, et al. (author)
  • The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results
  • 2015
  • In: Proceedings of the IEEE International Conference on Computer Vision. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467383905 ; , s. 639-651
  • Conference paper (peer-reviewed)abstract
    • The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.
  •  
6.
  • Kristan, Matej, et al. (author)
  • The Visual Object Tracking VOT2015 challenge results
  • 2015
  • In: Proceedings 2015 IEEE International Conference on Computer Vision Workshops ICCVW 2015. - : IEEE. - 9780769557205 ; , s. 564-586
  • Conference paper (peer-reviewed)abstract
    • The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2015 challenge that go beyond its VOT2014 predecessor are: (i) a new VOT2015 dataset twice as large as the VOT2014 dataset, with full annotation of targets by rotated bounding boxes and per-frame attributes, and (ii) extensions of the VOT2014 evaluation methodology by the introduction of a new performance measure. The dataset, the evaluation kit as well as the results are publicly available at the challenge website.
  •  
7.
  • Bhat, Goutam, et al. (author)
  • Combining Local and Global Models for Robust Re-detection
  • 2018
  • In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781538692943 - 9781538692936 - 9781538692950 ; , s. 25-30
  • Conference paper (peer-reviewed)abstract
    • Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.
  •  
8.
  • Bhat, Goutam, et al. (author)
  • NTIRE 2022 Burst Super-Resolution Challenge
  • 2022
  • In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022). - : IEEE. - 9781665487399 - 9781665487405 ; , s. 1040-1060
  • Conference paper (peer-reviewed)abstract
    • Burst super-resolution has received increased attention in recent years due to its applications in mobile photography. By merging information from multiple shifted images of a scene, burst super-resolution aims to recover details which otherwise cannot be obtained using a single input image. This paper reviews the NTIRE 2022 challenge on burst super-resolution. In the challenge, the participants were tasked with generating a clean RGB image with 4x higher resolution, given a RAW noisy burst as input. That is, the methods need to perform joint denoising, demosaicking, and super-resolution. The challenge consisted of two tracks. Track 1 employed synthetic data, where pixel-accurate high-resolution ground truths are available. Track 2, on the other hand, used real-world bursts captured from a handheld camera, along with approximately aligned reference images captured using a DSLR. 14 teams participated in the final testing phase. The top performing methods establish a new state-of-the-art on the burst super-resolution task.
  •  
9.
  • Bhat, Goutam, et al. (author)
  • Unveiling the power of deep tracking
  • 2018
  • In: Computer Vision – ECCV 2018. - Cham : Springer Publishing Company. - 9783030012151 - 9783030012168 ; , s. 493-509
  • Conference paper (peer-reviewed)abstract
    • In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of >17% in EAO.
  •  
10.
  • Brissman, Emil, 1987-, et al. (author)
  • Recurrent Graph Neural Networks for Video Instance Segmentation
  • 2023
  • In: International Journal of Computer Vision. - : Springer. - 0920-5691 .- 1573-1405. ; 131, s. 471-495
  • Journal article (peer-reviewed)abstract
    • Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state-of-the-art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
  •  
11.
  • Danelljan, Martin, et al. (author)
  • A Low-Level Active Vision Framework for Collaborative Unmanned Aircraft Systems
  • 2015
  • In: Computer Vision – ECCV 2014 Workshops, Part I. - Cham : Springer Publishing Company. - 9783319161778 - 9783319161785 ; , s. 223-237
  • Conference paper (peer-reviewed)abstract
    • Micro unmanned aerial vehicles are becoming increasingly interesting for aiding and collaborating with human agents in a myriad of applications, but in particular they are useful for monitoring inaccessible or dangerous areas. In order to interact with and monitor humans, these systems need robust and real-time computer vision subsystems that allow them to detect and follow persons. In this work, we propose a low-level active vision framework to accomplish these challenging tasks. Based on the LinkQuad platform, we present a system study that implements the detection and tracking of people under fully autonomous flight conditions, keeping the vehicle within a certain distance of a person. The framework integrates state-of-the-art methods from visual detection and tracking, Bayesian filtering, and AI-based control. The results from our experiments clearly suggest that the proposed framework performs real-time detection and tracking of persons in complex scenarios.
  •  
12.
  • Danelljan, Martin, 1989-, et al. (author)
  • A Probabilistic Framework for Color-Based Point Set Registration
  • 2016
  • In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; , s. 1818-1826
  • Conference paper (peer-reviewed)abstract
    • In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.
  •  
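
A rough sketch of the kind of joint spatial-color mixture model described in entry 12 above; the exact densities, priors and parametrization used in the paper may differ:

    p(x_i, c_i \mid \Theta) \;=\; \sum_k \pi_k \, \mathcal{N}\!\big(\phi(x_i);\, \mu_k, \Sigma_k\big) \, B_k(c_i)

Here x_i is a 3D point from one of the observed point sets, \phi the rigid transformation of that set, c_i the associated color, \mathcal{N}(\cdot;\mu_k,\Sigma_k) a Gaussian spatial mixture component and B_k a per-component color distribution. The parameters \Theta = \{\pi_k, \mu_k, \Sigma_k, B_k\} and the transformations are then estimated jointly with an EM algorithm, alternating between computing component responsibilities (E-step) and updating the parameters and transformations (M-step).
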
13.
  • Danelljan, Martin, et al. (author)
  • Accurate Scale Estimation for Robust Visual Tracking
  • 2014
  • In: Proceedings of the British Machine Vision Conference 2014. - : BMVA Press. - 1901725529
  • Conference paper (peer-reviewed)abstract
    • Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel approach for robust scale estimation in a tracking-by-detection framework. The proposed approach works by learning discriminative correlation filters based on a scale pyramid representation. We learn separate filters for translation and scale estimation, and show that this improves the performance compared to an exhaustive scale search. Our scale estimation approach is generic as it can be incorporated into any tracking method with no inherent scale estimation. Experiments are performed on 28 benchmark sequences with significant scale variations. Our results show that the proposed approach significantly improves the performance by 18.8% in median distance precision compared to our baseline. Finally, we provide both quantitative and qualitative comparison of our approach with state-of-the-art trackers in the literature. The proposed method is shown to outperform the best existing tracker by 16.6% in median distance precision, while operating in real time.
  •  
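
A minimal, illustrative sketch of the scale-search idea described in entry 13 above (a separate 1-D scale correlation filter evaluated over a scale pyramid). The helper `extract_features` and the pre-computed `scale_filter_fft` are assumptions standing in for the tracker's feature extraction and online DCF learning, which the paper describes but which are not reproduced here.

    import numpy as np

    def best_scale(image, center, base_size, scale_filter_fft, extract_features,
                   n_scales=33, scale_step=1.02):
        """Pick the relative scale factor that maximizes a 1-D scale filter response.

        scale_filter_fft : complex array (n_scales, feature_dim), the DFT of the
            (conjugated) learned scale filter -- assumed to be trained elsewhere.
        extract_features : hypothetical helper returning a feature vector for a
            patch of the given size centered at `center`.
        """
        # Geometric scale pyramid around the current target size.
        scales = scale_step ** (np.arange(n_scales) - n_scales // 2)
        sample = np.stack([
            extract_features(image, center, np.asarray(base_size) * s)
            for s in scales
        ])                                            # shape: (n_scales, feature_dim)
        # Correlate along the scale dimension in the Fourier domain and sum
        # the per-dimension responses, as in standard DCF inference.
        response = np.fft.ifft(
            np.sum(scale_filter_fft * np.fft.fft(sample, axis=0), axis=1)
        ).real
        return scales[int(np.argmax(response))]
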
14.
  • Danelljan, Martin, et al. (author)
  • Adaptive Color Attributes for Real-Time Visual Tracking
  • 2014
  • In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014. - : IEEE Computer Society. - 9781479951178 ; , s. 1090-1097
  • Conference paper (peer-reviewed)abstract
    • Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient, and possess a certain amount of photometric invariance while maintaining high discriminative power. This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.
  •  
15.
  • Danelljan, Martin, 1989-, et al. (author)
  • Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
  • 2016
  • In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; , s. 1430-1438
  • Conference paper (peer-reviewed)abstract
    • Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
  •  
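
The "single loss over both the target appearance model and the sample quality weights" mentioned in entry 15 above has, schematically, the following joint form (a simplified paraphrase, not necessarily the paper's exact objective):

    \min_{\theta,\;\alpha \ge 0} \;\; \sum_{k=1}^{t} \alpha_k \, L(\theta; x_k, y_k) \;+\; \frac{1}{\mu} \sum_{k=1}^{t} \frac{\alpha_k^2}{\rho_k}
    \quad \text{subject to} \quad \sum_{k=1}^{t} \alpha_k = 1

where \theta are the appearance-model parameters, (x_k, y_k) the collected training samples, \alpha_k their learned quality weights, \rho_k prior sample weights and \mu a regularization constant. Minimizing jointly over \theta and \alpha lets corrupted samples receive small weights while correct samples gain influence.
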
16.
  • Danelljan, Martin, 1989-, et al. (author)
  • Aligning the Dissimilar: A Probabilistic Feature-Based Point Set Registration Approach
  • 2016
  • In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR) 2016. - : IEEE. - 9781509048472 - 9781509048489 ; , s. 247-252
  • Conference paper (peer-reviewed)abstract
    • 3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D points, and only recently has color information been integrated into such a framework, significantly improving registration accuracy. In addition to local color information, high-dimensional 3D shape features have been successfully employed in many applications such as action recognition and 3D object recognition. In this paper, we propose a probabilistic framework to integrate high-dimensional 3D shape features with color information for point set registration. The 3D shape features are distinctive and provide complementary information beneficial for robust registration. We validate our proposed framework by performing comprehensive experiments on the challenging Stanford Lounge dataset, acquired by an RGB-D sensor, and an outdoor dataset captured by a Lidar sensor. The results clearly demonstrate that our approach provides superior results both in terms of robustness and accuracy compared to state-of-the-art probabilistic methods.
  •  
17.
  • Danelljan, Martin, 1989-, et al. (author)
  • ATOM: Accurate tracking by overlap maximization
  • 2019
  • In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019). - : IEEE. - 9781728132938 ; , s. 4655-4664
  • Conference paper (peer-reviewed)abstract
    • While recent years have witnessed astonishing improvements in visual tracking robustness, the advancements in tracking accuracy have been limited. As the focus has been directed towards the development of powerful classifiers, the problem of accurate target state estimation has been largely overlooked. In fact, most trackers resort to a simple multi-scale search in order to estimate the target bounding box. We argue that this approach is fundamentally limited since target estimation is a complex task, requiring high-level knowledge about the object. We address this problem by proposing a novel tracking architecture, consisting of dedicated target estimation and classification components. High-level knowledge is incorporated into the target estimation through extensive offline learning. Our target estimation component is trained to predict the overlap between the target object and an estimated bounding box. By carefully integrating target-specific information, our approach achieves previously unseen bounding box accuracy. We further introduce a classification component that is trained online to guarantee high discriminative power in the presence of distractors. Our final tracking framework sets a new state-of-the-art on five challenging benchmarks. On the new large-scale TrackingNet dataset, our tracker ATOM achieves a relative gain of 15% over the previous best approach, while running at over 30 FPS. Code and models are available at https://github.com/visionml/pytracking.
  •  
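
A simplified sketch of the overlap-maximization step described in entry 17 above: a candidate bounding box is refined by gradient ascent on a network's predicted overlap (IoU). `iou_predictor` is a hypothetical stand-in for the paper's offline-trained IoU prediction module (the actual implementation is in the linked pytracking repository) and is assumed to return a scalar tensor.

    import torch

    def refine_box(iou_predictor, features, box, steps=5, step_length=1.0):
        """Gradient-ascent refinement of box = (x, y, w, h) on predicted IoU."""
        box = box.clone().detach().requires_grad_(True)
        for _ in range(steps):
            predicted_iou = iou_predictor(features, box)   # scalar tensor
            predicted_iou.backward()
            with torch.no_grad():
                # Scale the step by the box width/height so the update is
                # roughly invariant to target size, then take an ascent step.
                box += step_length * box.grad * box[2:4].repeat(2)
                box.grad.zero_()
        return box.detach()
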
18.
  • Danelljan, Martin, 1989-, et al. (author)
  • Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking
  • 2016
  • In: Computer Vision – ECCV 2016. - Cham : Springer. - 9783319464534 - 9783319464541 ; , s. 472-488
  • Conference paper (peer-reviewed)abstract
    • Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.
  •  
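
The continuous formulation sketched in entry 18 above rests on an implicit interpolation of each discrete feature channel; in simplified notation (regularization and further details omitted):

    J_d\{x^d\}(t) \;=\; \sum_{n=0}^{N_d - 1} x^d[n]\; b_d\!\Big(t - \tfrac{T}{N_d}\, n\Big),
    \qquad
    S_f\{x\}(t) \;=\; \sum_{d} \big(f^d * J_d\{x^d\}\big)(t)

Every channel x^d, sampled at its own resolution N_d, is lifted to a continuous function on [0, T) through an interpolation kernel b_d, and the continuous filters f^d are learned by minimizing a regularized least-squares error between the summed confidence function S_f\{x\} and a desired continuous response. This is what allows multi-resolution feature maps to be fused and sub-pixel localization to be performed.
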
19.
  • Danelljan, Martin, 1989-, et al. (author)
  • Coloring Channel Representations for Visual Tracking
  • 2015
  • In: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings. - Cham : Springer. - 9783319196640 - 9783319196657 ; , s. 117-129
  • Conference paper (peer-reviewed)abstract
    • Visual object tracking is a classical, but still open, research problem in computer vision, with many real world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically-inspired channel representations (Felsberg et al., PAMI 2006) have shown to provide promising results in many applications ranging from autonomous driving to visual tracking. This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking. Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color features, together with an optimal fusion strategy, significantly outperforms the standard luminance-based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.
  •  
20.
  • Danelljan, Martin, et al. (author)
  • Convolutional Features for Correlation Filter Based Visual Tracking
  • 2015
  • In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW). - : IEEE conference proceedings. - 9781467397117 - 9781467397100 ; , s. 621-629
  • Conference paper (peer-reviewed)abstract
    • Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features for the visual tracking problem. We propose to use activations from the convolutional layer of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need for task-specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, and in contrast to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.
  •  
21.
  • Danelljan, Martin, 1989-, et al. (author)
  • Deep motion and appearance cues for visual tracking
  • 2019
  • In: Pattern Recognition Letters. - : Elsevier. - 0167-8655 .- 1872-7344. ; 124, s. 74-81
  • Journal article (peer-reviewed)abstract
    • Generic visual tracking is a challenging computer vision problem, with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in the case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that the motion cue provides discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
  •  
22.
  • Danelljan, Martin, 1989-, et al. (author)
  • Discriminative Scale Space Tracking
  • 2017
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. - : IEEE Computer Society. - 0162-8828 .- 1939-3539. ; 39:8, s. 1561-1575
  • Journal article (peer-reviewed)abstract
    • Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when encountering large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.
  •  
23.
  • Danelljan, Martin, 1989-, et al. (author)
  • ECO: Efficient Convolution Operators for Tracking
  • 2017
  • In: Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781538604571 - 9781538604588 ; , s. 6931-6939
  • Conference paper (peer-reviewed)abstract
    • In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top-ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.
  •  
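
In simplified notation, the factorized convolution operator from entry 23 above replaces the per-channel filters of the continuous formulation (see the sketch after entry 18) with linear combinations of a much smaller filter set:

    S_{Pf}\{x\} \;=\; \sum_{c,d} p_{dc}\, f^c * J_d\{x^d\} \;=\; f * P^{\mathsf{T}} J\{x\}

where P is a learned D \times C projection matrix with C much smaller than the number of feature channels D, so only the C filters f^c and the matrix P need to be optimized. This is the main source of the reported reduction in model parameters; the exact optimization details follow the paper and its predecessor C-COT.
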
24.
  • Danelljan, Martin, 1989- (author)
  • Learning Convolution Operators for Visual Tracking
  • 2018
  • Doctoral thesis (other academic/artistic)abstract
    • Visual tracking is one of the fundamental problems in computer vision. Its numerous applications include robotics, autonomous driving, augmented reality and 3D reconstruction. In essence, visual tracking can be described as the problem of estimating the trajectory of a target in a sequence of images. The target can be any image region or object of interest. While humans excel at this task, requiring little effort to perform accurate and robust visual tracking, it has proven difficult to automate. It has therefore remained one of the most active research topics in computer vision. In its most general form, no prior knowledge about the object of interest or environment is given, except for the initial target location. This general form of tracking is known as generic visual tracking. The unconstrained nature of this problem makes it particularly difficult, yet applicable to a wider range of scenarios. As no prior knowledge is given, the tracker must learn an appearance model of the target on-the-fly. Cast as a machine learning problem, it imposes several major challenges which are addressed in this thesis. The main purpose of this thesis is the study and advancement of the so-called Discriminative Correlation Filter (DCF) framework, as it has been shown to be particularly suitable for the tracking application. By utilizing properties of the Fourier transform, a correlation filter is discriminatively learned by efficiently minimizing a least-squares objective. The resulting filter is then applied to a new image in order to estimate the target location. This thesis contributes to the advancement of the DCF methodology in several aspects. The main contribution regards the learning of the appearance model: First, the problem of updating the appearance model with new training samples is covered. Efficient update rules and numerical solvers are investigated for this task. Second, the periodic assumption induced by the circular convolution in DCF is countered by proposing a spatial regularization component. Third, an adaptive model of the training set is proposed to alleviate the impact of corrupted or mislabeled training samples. Fourth, a continuous-space formulation of the DCF is introduced, enabling the fusion of multi-resolution features and sub-pixel accurate predictions. Finally, the problems of computational complexity and overfitting are addressed by investigating dimensionality reduction techniques. As a second contribution, different feature representations for tracking are investigated. A particular focus is put on the analysis of color features, which had been largely overlooked in prior tracking research. This thesis also studies the use of deep features in DCF-based tracking. While many vision problems have greatly benefited from the advent of deep learning, it has proven difficult to harvest the power of such representations for tracking. In this thesis it is shown that both shallow and deep layers contribute positively. Furthermore, the problem of fusing their complementary properties is investigated. The final major contribution of this thesis regards the prediction of the target scale. In many applications, it is essential to track the scale, or size, of the target since it is strongly related to the relative distance. A thorough analysis of how to integrate scale estimation into the DCF framework is performed. A one-dimensional scale filter is proposed, enabling efficient and accurate scale estimation.
  •  
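
The Fourier-domain least-squares learning referred to in the thesis abstract above has, in its simplest single-channel form, a standard closed-form solution (a textbook-style illustration rather than any one of the thesis's specific models; correlation/convolution and conjugation conventions vary):

    \min_f \;\sum_{k} \big\| f \star x_k - y_k \big\|^2 + \lambda \|f\|^2
    \;\;\Longrightarrow\;\;
    \hat{f}^{*} \;=\; \frac{\sum_k \hat{y}_k \,\overline{\hat{x}}_k}{\sum_k \hat{x}_k \overline{\hat{x}}_k + \lambda}

where \hat{\cdot} denotes the DFT, the bar complex conjugation, \star circular correlation, and the division is element-wise. The learned filter is applied to a new frame z by evaluating \mathcal{F}^{-1}\{\hat{f}^{*} \hat{z}\} and taking the location of the maximum response.
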
25.
  • Danelljan, Martin, et al. (author)
  • Learning Spatially Regularized Correlation Filters for Visual Tracking
  • 2015
  • In: Proceedings of the International Conference in Computer Vision (ICCV), 2015. - : IEEE Computer Society. - 9781467383905 ; , s. 4310-4318
  • Conference paper (peer-reviewed)abstract
    • Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model. We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.
  •  
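
In simplified multi-channel notation, the spatially regularized learning objective described in entry 25 above takes the form

    \varepsilon(f) \;=\; \sum_{k=1}^{t} \alpha_k \Big\| \sum_{d=1}^{D} x_k^d * f^d \;-\; y_k \Big\|^2 \;+\; \sum_{d=1}^{D} \big\| w \cdot f^d \big\|^2

where x_k^d are the feature channels of training sample k, y_k the desired correlation output, \alpha_k sample weights, and w a spatial weight map that is small over the target region and large towards the borders of the filter. Coefficients belonging to background regions are thereby penalized and the boundary effects of the circular convolution are suppressed; this is the objective to which the Gauss-Seidel optimization mentioned in the abstract is applied in the Fourier domain.
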
Type of publication
conference paper (46)
journal article (6)
doctoral thesis (3)
Type of content
peer-reviewed (52)
other academic/artistic (3)
Author/Editor
Danelljan, Martin (30)
Felsberg, Michael, 1 ... (27)
Danelljan, Martin, 1 ... (23)
Khan, Fahad Shahbaz, ... (22)
Felsberg, Michael (16)
Bhat, Goutam (15)
Häger, Gustav (11)
Matas, Jiri (11)
Fernandez, Gustavo (10)
Kristan, Matej (9)
Leonardis, Ales (9)
Lukezic, Alan (9)
Khan, Fahad (8)
Pflugfelder, Roman (8)
van de Weijer, Joost (7)
Schön, Thomas B., Pr ... (7)
Vojır, Tomas (7)
Bertinetto, Luca (7)
Golodetz, Stuart (7)
Järemo-Lawin, Felix (7)
Gustafsson, Fredrik ... (6)
Li, Yang (6)
Torr, Philip H.S. (6)
Johnander, Joakim (6)
Khan, Fahad Shahbaz (6)
Bowden, Richard (6)
Zhu, Jianke (6)
Martinez, Jose M. (6)
Wen, Longyin (6)
Miksik, Ondrej (6)
Martin-Nieto, Rafael (6)
Petrosino, Alfredo (6)
Hadfield, Simon (6)
Lu, Huchuan (6)
Li, Xin (5)
Timofte, Radu (5)
Van Gool, Luc (5)
Becker, Stefan (5)
Tang, Ming (5)
Robinson, Andreas, 1 ... (5)
Cehovin, Luka (5)
Du, Dawei (5)
Arens, Michael (5)
Lyu, Siwei (5)
Possegger, Horst (5)
Valmadre, Jack (5)
Palaniappan, Kannapp ... (5)
Lebeda, Karel (5)
He, Zhenyu (5)
Zajc, Luka Čehovin (5)
University
Linköping University (47)
Uppsala University (7)
Umeå University (1)
Royal Institute of Technology (1)
Language
English (55)
Research subject (UKÄ/SCB)
Natural sciences (51)
Engineering and Technology (4)
Medical and Health Sciences (1)
Social Sciences (1)

