SwePub
Search the SwePub database


Hit list for the search "WFRF:(Khan Fahad Shahbaz)"

Search: WFRF:(Khan Fahad Shahbaz)

  • Results 1-25 of 64
   
1.
  • Bhunia, Ankan Kumar, et al. (authors)
  • Handwriting Transformers
  • 2021
  • In: 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021). - : IEEE. - 9781665428125 - 9781665428132 ; pp. 1066-1074
  • Other publication (other scholarly/artistic), abstract
    • We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.
  •  
2.
  • Joseph, KJ, et al. (authors)
  • Towards Open World Object Detection
  • 2021
  • In: 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021. - : IEEE COMPUTER SOC. - 9781665445092 ; pp. 5826-5836
  • Conference paper (other scholarly/artistic), abstract
    • Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called: 'Open World Object Detection', where a model is tasked to: 1) identify objects that have not been introduced to it as 'unknown', without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy based unknown identification. Our experimental evaluation and ablation studies analyse the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterising unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.
  •  
3.
  • Khan, Fahad Shahbaz, et al. (authors)
  • Data Mining in Oral Medicine Using Decision Trees
  • 2008
  • In: Proceedings of the 5th International Conference on Computer, Electrical, and Systems Science, and Engineering (CESSE 2008), Cairo, Egypt, February 6–8, 2008. - : World Academy of Science Engineering and Technology - WASET. ; 27, pp. 225-230
  • Conference paper (peer-reviewed), abstract
    • Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning, in the form of medical experts' actions, that is inherent in a large number of EMRs (Electronic Medical Records). In this way the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who present with different problems within the practice context over time.
  •  
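The abstract above describes fitting decision trees to electronic medical records so that recorded expert actions can be replayed as explicit rules. Below is a minimal, hypothetical sketch of that idea using scikit-learn; the field names, the toy records and the target actions are illustrative assumptions, not data or code from the paper.

    # Hedged sketch: fit a decision tree to toy EMR-style records and print the learned rules.
    # Every field name and value here is hypothetical, chosen only to illustrate the idea.
    from sklearn.preprocessing import OrdinalEncoder
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy EMR-like records: [lesion_site, pain_level, smoker] -> recorded clinician action
    records = [
        ["tongue",  "high", "yes"],
        ["gingiva", "low",  "no"],
        ["palate",  "high", "no"],
        ["tongue",  "low",  "yes"],
    ]
    actions = ["biopsy", "review", "biopsy", "review"]  # hypothetical expert actions

    X = OrdinalEncoder().fit_transform(records)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, actions)

    # The printed rules approximate the clinical reasoning recorded in the EMRs.
    print(export_text(tree, feature_names=["lesion_site", "pain_level", "smoker"]))

Retrained as new records arrive, such a tree would provide the kind of continuously extracted, teachable decision process the abstract refers to.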
4.
  • Khan, Rahat, et al. (authors)
  • Discriminative Color Descriptors
  • 2013
  • In: Computer Vision and Pattern Recognition (CVPR), 2013. - : IEEE Computer Society. ; pp. 2866-2873
  • Conference paper (peer-reviewed), abstract
    • Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.
  •  
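The abstract above describes clustering color values by their discriminative power so that the drop in mutual information between colors and classes is minimized. The sketch below is one generic greedy reading of that idea; the input histograms, variable names and the exact merge criterion are this editor's assumptions, not the authors' released algorithm.

    # Hedged sketch: greedily merge color bins while losing as little class information as possible.
    # p_w[i] is the prior of color bin i; p_c_given_w[i] is the class distribution observed in bin i.
    import numpy as np

    def kl(p, q):
        m = p > 0
        return float(np.sum(p[m] * np.log(p[m] / q[m])))

    def merge_cost(pw_i, pw_j, pc_i, pc_j):
        # Drop in mutual information I(class; color) caused by merging bins i and j.
        w = pw_i + pw_j
        pc_merged = (pw_i * pc_i + pw_j * pc_j) / w
        return pw_i * kl(pc_i, pc_merged) + pw_j * kl(pc_j, pc_merged)

    def cluster_colors(p_w, p_c_given_w, n_clusters):
        bins = [[i] for i in range(len(p_w))]                 # start with one cluster per color bin
        pw = list(map(float, p_w))
        pc = [np.asarray(d, dtype=float) for d in p_c_given_w]
        while len(bins) > n_clusters:
            _, i, j = min(((merge_cost(pw[a], pw[b], pc[a], pc[b]), a, b)
                           for a in range(len(bins)) for b in range(a + 1, len(bins))),
                          key=lambda t: t[0])
            w = pw[i] + pw[j]
            pc[i] = (pw[i] * pc[i] + pw[j] * pc[j]) / w       # class distribution of the merged bin
            pw[i] = w
            bins[i] += bins[j]
            del bins[j], pw[j], pc[j]
        return bins                                           # each surviving cluster is one descriptor dimension

With p_w and p_c_given_w estimated from a labeled training set, the surviving clusters would play the role of the learned color descriptor bins; any photometric invariance then emerges only to the extent that it does not hurt discrimination, which is the trade-off the abstract argues for.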
5.
  • Narayan, Sanath, et al. (authors)
  • Discriminative Region-based Multi-Label Zero-Shot Learning
  • 2021
  • In: 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021). - : IEEE. - 9781665428125 ; pp. 8711-8720
  • Other publication (other scholarly/artistic), abstract
    • Multi-label zero-shot learning (ZSL) is a more realistic counterpart of standard single-label ZSL since several objects can co-exist in a natural image. However, the occurrence of multiple objects complicates the reasoning and requires region-specific processing of visual features to preserve their contextual cues. We note that the best existing multi-label ZSL method takes a shared approach towards attending to region features with a common set of attention maps for all the classes. Such shared maps lead to diffused attention, which does not discriminatively focus on relevant locations when the number of classes is large. Moreover, mapping spatially-pooled visual features to the class semantics leads to inter-class feature entanglement, thus hampering the classification. Here, we propose an alternate approach towards region-based discriminability-preserving multi-label zero-shot classification. Our approach maintains the spatial resolution to preserve region-level characteristics and utilizes a bi-level attention module (BiAM) to enrich the features by incorporating both region and scene context information. The enriched region-level features are then mapped to the class semantics and only their class predictions are spatially pooled to obtain image-level predictions, thereby keeping the multi-class features disentangled. Our approach sets a new state of the art on two large-scale multi-label zero-shot benchmarks: NUS-WIDE and Open Images. On NUS-WIDE, our approach achieves an absolute gain of 6.9% mAP for ZSL, compared to the best published results.
  •  
6.
  • Naseer, M., et al. (authors)
  • A Self-supervised Approach for Adversarial Robustness
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). - : IEEE. - 9781728171685 ; pp. 259-268
  • Conference paper (peer-reviewed), abstract
    • Adversarial examples can cause catastrophic mistakes in vision systems based on Deep Neural Networks (DNNs), e.g. for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock towards their real-world deployment. The transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training, which enhances robustness by modifying the target model's parameters, lacks such generalizability. On the other hand, different input-processing based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against unseen adversarial attacks (e.g., by reducing the success rate of the translation-invariant ensemble attack from 82.6% to 31.9% in comparison to the previous state of the art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for the case of classification, segmentation and detection.
  •  
7.
  • Naseer, Muzammal, et al. (authors)
  • On Generating Transferable Targeted Perturbations
  • 2021
  • In: 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021). - : IEEE. - 9781665428125 - 9781665428132 ; pp. 7688-7697
  • Other publication (other scholarly/artistic), abstract
    • While the untargeted black-box transferability of adversarial perturbations has been extensively studied before, changing an unseen model's decisions to a specific 'targeted' class remains a challenging feat. In this paper, we propose a new generative approach for highly transferable targeted perturbations. We note that the existing methods are less suitable for this task due to their reliance on class-boundary information that changes from one model to another, thus reducing transferability. In contrast, our approach matches the perturbed image 'distribution' with that of the target class, leading to high targeted transferability rates. To this end, we propose a new objective function that not only aligns the global distributions of source and target images, but also matches the local neighbourhood structure between the two domains. Based on the proposed objective, we train a generator function that can adaptively synthesize perturbations specific to a given input. Our generative approach is independent of the source or target domain labels, while consistently performing well against state-of-the-art methods on a wide range of attack settings. As an example, we achieve 32.63% target transferability from (an adversarially weak) VGG19BN to (a strong) WideResNet on the ImageNet val. set, which is 4× higher than the previous best generative attack and 16× better than the instance-specific iterative attack.
  •  
8.
  •  
9.
  • Pang, Yanwei, et al. (authors)
  • Mask-Guided Attention Network for Occluded Pedestrian Detection
  • 2019
  • In: 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019). - : IEEE COMPUTER SOC. - 9781728148038 ; pp. 4966-4974
  • Conference paper (peer-reviewed), abstract
    • Pedestrian detection relying on deep convolution neural networks has made significant progress. Though promising results have been achieved on standard pedestrians, the performance on heavily occluded pedestrians remains far from satisfactory. The main culprits are intra-class occlusions involving other pedestrians and inter-class occlusions caused by other objects, such as cars and bicycles. These result in a multitude of occlusion patterns. We propose an approach for occluded pedestrian detection with the following contributions. First, we introduce a novel mask-guided attention network that fits naturally into popular pedestrian detection pipelines. Our attention network emphasizes visible pedestrian regions while suppressing the occluded ones by modulating full body features. Second, we empirically demonstrate that coarse-level segmentation annotations provide a reasonable approximation to their dense pixel-wise counterparts. Experiments are performed on the CityPersons and Caltech datasets. Our approach sets a new state-of-the-art on both datasets. Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results [31] on the heavily occluded HO pedestrian set of the CityPersons test set. Further, on the HO pedestrian set of the Caltech dataset, our method achieves an absolute gain of 5.0% in log-average miss rate, compared to the best reported results [13]. Code and models are available at: https://github.com/Leotju/MGAN.
  •  
10.
  • Rajasegaran, J., et al. (authors)
  • iTAML: An Incremental Task-Agnostic Meta-learning Approach
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). - : IEEE. - 9781728171685 ; pp. 13585-13594
  • Conference paper (peer-reviewed), abstract
    • Humans can continuously learn new knowledge as their experience grows. In contrast, previous learning in deep neural networks can quickly fade out when the networks are trained on a new task. In this paper, we hypothesize this problem can be avoided by learning a set of generalized parameters that are neither specific to old nor new tasks. In this pursuit, we introduce a novel meta-learning approach that seeks to maintain an equilibrium between all the encountered tasks. This is ensured by a new meta-update rule which avoids catastrophic forgetting. In comparison to previous meta-learning techniques, our approach is task-agnostic. When presented with a continuum of data, our model automatically identifies the task and quickly adapts to it with just a single update. We perform extensive experiments on five datasets in a class-incremental setting, leading to significant improvements over state-of-the-art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks). Specifically, on large-scale datasets that generally prove difficult cases for incremental learning, our approach delivers absolute gains as high as 19.1% and 7.4% on the ImageNet and MS-Celeb datasets, respectively.
  •  
11.
  • Ranasinghe, Kanchana, et al. (authors)
  • Orthogonal Projection Loss
  • 2021
  • In: 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021). - : IEEE. - 9781665428125 ; pp. 12313-12323
  • Other publication (other scholarly/artistic), abstract
    • Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de-facto objective function. The CE loss encourages features of a class to have a higher projection score on the true class-vector compared to the negative classes. However, this is a relative constraint and does not explicitly force different class features to be well-separated. Motivated by the observation that ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed 'Orthogonal Projection Loss' (OPL) which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints on the mini-batch level. Compared to other alternatives to CE, OPL offers unique advantages: it adds no learnable parameters, does not require careful negative mining, and is not sensitive to batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise.
  •  
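The abstract above describes a loss that clusters same-class features while pushing features of different classes toward orthogonality, computed per mini-batch alongside cross-entropy. Below is a small PyTorch-style sketch of one such orthogonality loss; the normalization, the masking and the gamma balance term are assumptions made here, and the snippet is not the authors' released implementation of OPL.

    # Hedged sketch of an orthogonality-style loss on a mini-batch of features.
    # features: (N, D) tensor, labels: (N,) tensor of class ids; gamma is an assumed balance weight.
    import torch
    import torch.nn.functional as F

    def orthogonality_loss(features, labels, gamma=0.5):
        f = F.normalize(features, dim=1)                       # unit-norm features
        cos = f @ f.t()                                        # pairwise cosine similarities (N, N)
        same = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
        eye = torch.eye(len(labels), device=features.device)
        pos_mask = same - eye                                  # same class, excluding self-pairs
        neg_mask = 1.0 - same                                  # pairs from different classes

        pos = (cos * pos_mask).sum() / pos_mask.sum().clamp(min=1.0)        # intra-class similarity
        neg = (cos.abs() * neg_mask).sum() / neg_mask.sum().clamp(min=1.0)  # |cos| across classes

        # Pull same-class features together (pos -> 1) and push different-class
        # features toward orthogonality (|cos| -> 0); intended for use alongside cross-entropy.
        return (1.0 - pos) + gamma * neg

    # Example usage with the usual softmax cross-entropy on the classifier logits:
    # loss = F.cross_entropy(logits, labels) + orthogonality_loss(penultimate_features, labels)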
12.
  • Wang, Y., et al. (authors)
  • Semi-Supervised Learning for Few-Shot Image-to-Image Translation
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). - : IEEE. - 9781728171685 ; pp. 4452-4461
  • Conference paper (peer-reviewed), abstract
    • In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: https://github.com/yaxingwang/SEMIT.
  •  
13.
  • Abbafati, Cristiana, et al. (authors)
  • 2020
  • Journal article (peer-reviewed)
  •  
14.
  • Bhat, Goutam, et al. (authors)
  • Combining Local and Global Models for Robust Re-detection
  • 2018
  • In: Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018. - : Institute of Electrical and Electronics Engineers (IEEE). - 9781538692943 - 9781538692936 - 9781538692950 ; pp. 25-30
  • Conference paper (peer-reviewed), abstract
    • Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples, used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.
  •  
15.
  • Bhat, Goutam, et al. (authors)
  • Unveiling the power of deep tracking
  • 2018
  • In: Computer Vision – ECCV 2018. - Cham : Springer Publishing Company. - 9783030012151 - 9783030012168 ; pp. 493-509
  • Conference paper (peer-reviewed), abstract
    • In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of >17% in EAO.
  •  
16.
  • Cao, Jiale, et al. (authors)
  • From Handcrafted to Deep Features for Pedestrian Detection: A Survey
  • 2022
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. - New York : IEEE. - 0162-8828 .- 1939-3539. ; 44:9, pp. 4913-4934
  • Journal article (peer-reviewed), abstract
    • Pedestrian detection is an important but challenging problem in computer vision, especially in human-centric tasks. Over the past decade, significant improvement has been witnessed with the help of handcrafted features and deep features. Here we present a comprehensive survey on recent advances in pedestrian detection. First, we provide a detailed review of single-spectral pedestrian detection that includes handcrafted features based methods and deep features based approaches. For handcrafted features based methods, we present an extensive review of approaches and find that handcrafted features with large freedom degrees in shape and space have better performance. In the case of deep features based approaches, we split them into pure CNN based methods and those employing both handcrafted and CNN based features. We give the statistical analysis and tendency of these methods, where feature enhanced, part-aware, and post-processing methods have attracted main attention. In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features for illumination variance. Furthermore, we introduce some related datasets and evaluation metrics, and a deep experimental analysis. We conclude this survey by emphasizing open problems that need to be addressed and highlighting various future directions. Researchers can track an up-to-date list at https://github.com/JialeCao001/PedSurvey.
  •  
17.
  • Cao, Jiale, et al. (authors)
  • SipMaskv2: Enhanced Fast Image and Video Instance Segmentation
  • 2023
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. - : IEEE. - 0162-8828 .- 1939-3539 .- 2160-9292. ; 45:3, pp. 3798-3812
  • Journal article (peer-reviewed), abstract
    • We propose a fast single-stage method for both image and video instance segmentation, called SipMask, that preserves the instance spatial information by performing multiple sub-region mask predictions. The main module in our method is a light-weight spatial preservation (SP) module that generates a separate set of spatial coefficients for the sub-regions within a bounding-box, enabling a better delineation of spatially adjacent instances. To better correlate mask prediction with object detection, we further propose a mask alignment weighting loss and a feature alignment scheme. In addition, we identify two issues that impede the performance of single-stage instance segmentation and introduce two modules, including a sample selection scheme and an instance refinement module, to address these two issues. Experiments are performed on both image instance segmentation dataset MS COCO and video instance segmentation dataset YouTube-VIS. On MS COCO test-dev set, our method achieves a state-of-the-art performance. In terms of real-time capabilities, it outperforms YOLACT by a gain of 3.0% (mask AP) under the similar settings, while operating at a comparable speed. On YouTube-VIS validation set, our method also achieves promising results. The source code is available at https://github.com/JialeCao001/SipMask.
  •  
18.
  • Cholakkal, Hisham, et al. (authors)
  • Object Counting and Instance Segmentation with Image-level Supervision
  • 2019
  • In: 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), Long Beach, CA, JUN 16-20, 2019. - : IEEE. - 9781728132938 ; pp. 12389-12397
  • Conference paper (peer-reviewed), abstract
    • Common object counting in a natural scene is a challenging problem in computer vision with numerous real-world applications. Existing image-level supervised common object counting approaches only predict the global object count and rely on additional instance-level supervision to also determine object locations. We propose an image-level supervised approach that provides both the global object count and the spatial distribution of object instances by constructing an object category density map. Motivated by psychological studies, we further reduce image-level supervision using a limited object count information (up to four). To the best of our knowledge, we are the first to propose image-level supervised density map estimation for common object counting and demonstrate its effectiveness in image-level supervised instance segmentation. Comprehensive experiments are performed on the PASCAL VOC and COCO datasets. Our approach outperforms existing methods, including those using instance-level supervision, on both datasets for common object counting. Moreover, our approach improves state-of-the-art image-level supervised instance segmentation [34] with a relative gain of 17.8% in terms of average best overlap, on the PASCAL VOC 2012 dataset.
  •  
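The abstract above describes predicting an object-category density map whose spatial sum gives the global count, so that only image-level count supervision is needed. The fragment below is a minimal, hypothetical illustration of that relationship; the tensor shapes, the loss choice and all names are assumptions, not the authors' training code.

    # Hedged sketch: a predicted per-category density map, supervised only by image-level counts.
    # density: (B, C, H, W) tensor from some backbone head; gt_counts: (B, C) image-level counts.
    import torch
    import torch.nn.functional as F

    def count_loss(density, gt_counts):
        pred_counts = density.sum(dim=(2, 3))   # global count = spatial sum of the density map
        return F.smooth_l1_loss(pred_counts, gt_counts)

    # At test time, the spatial peaks of the density map indicate where the counted instances
    # lie, which is the weakly supervised localization the abstract describes.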
19.
  • Danelljan, Martin, et al. (authors)
  • A Low-Level Active Vision Framework for Collaborative Unmanned Aircraft Systems
  • 2015
  • In: COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I. - Cham : Springer Publishing Company. - 9783319161778 - 9783319161785 ; pp. 223-237
  • Conference paper (peer-reviewed), abstract
    • Micro unmanned aerial vehicles are becoming increasingly interesting for aiding and collaborating with human agents in a myriad of applications, but in particular they are useful for monitoring inaccessible or dangerous areas. In order to interact with and monitor humans, these systems need robust and real-time computer vision subsystems that allow them to detect and follow persons. In this work, we propose a low-level active vision framework to accomplish these challenging tasks. Based on the LinkQuad platform, we present a system study that implements the detection and tracking of people under fully autonomous flight conditions, keeping the vehicle within a certain distance of a person. The framework integrates state-of-the-art methods from visual detection and tracking, Bayesian filtering, and AI-based control. The results from our experiments clearly suggest that the proposed framework performs real-time detection and tracking of persons in complex scenarios.
  •  
20.
  • Danelljan, Martin, 1989-, et al. (authors)
  • A Probabilistic Framework for Color-Based Point Set Registration
  • 2016
  • In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; pp. 1818-1826
  • Conference paper (peer-reviewed), abstract
    • In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.
  •  
21.
  • Danelljan, Martin, et al. (authors)
  • Adaptive Color Attributes for Real-Time Visual Tracking
  • 2014
  • In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014. - : IEEE Computer Society. - 9781479951178 ; pp. 1090-1097
  • Conference paper (peer-reviewed), abstract
    • Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient, and possess a certain amount of photometric invariance while maintaining high discriminative power. This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.
  •  
22.
  • Danelljan, Martin, 1989-, et al. (authors)
  • Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
  • 2016
  • In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). - : Institute of Electrical and Electronics Engineers (IEEE). - 9781467388511 - 9781467388528 ; pp. 1430-1438
  • Conference paper (peer-reviewed), abstract
    • Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
  •  
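The abstract above describes a single joint loss over the target appearance model and per-sample quality weights, so that corrupted training samples are down-weighted while correct ones gain influence. Purely to make that idea concrete, one schematic way such a joint objective could be written is shown below; the symbols, the regularizers and the exponent p are this editor's assumptions, not the paper's exact formulation.

    \min_{\theta,\;\alpha}\;\; \sum_{k=1}^{K} \alpha_k \, L\left(\theta;\, x_k\right)
      \;+\; \lambda \,\lVert \theta \rVert^{2}
      \;+\; \frac{1}{\mu} \sum_{k=1}^{K} \alpha_k^{\,p}
    \qquad \text{subject to} \quad \alpha_k \ge 0,\;\; \sum_{k=1}^{K} \alpha_k = 1 .

Minimizing jointly over the appearance model \theta and the weights \alpha lets samples with persistently large loss L(\theta; x_k) receive small \alpha_k, effectively decontaminating the training set, while the last term with p > 1 keeps the weights from collapsing onto a few easy samples.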
23.
  • Danelljan, Martin, 1989-, et al. (authors)
  • Aligning the Dissimilar: A Probabilistic Feature-Based Point Set Registration Approach
  • 2016
  • In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR) 2016. - : IEEE. - 9781509048472 - 9781509048489 ; pp. 247-252
  • Conference paper (peer-reviewed), abstract
    • 3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D-points, and only recently has color information been integrated into such a framework, significantly improving registration accuracy. Other than local color information, high-dimensional 3D shape features have been successfully employed in many applications such as action recognition and 3D object recognition. In this paper, we propose a probabilistic framework to integrate high-dimensional 3D shape features with color information for point set registration. The 3D shape features are distinctive and provide complementary information beneficial for robust registration. We validate our proposed framework by performing comprehensive experiments on the challenging Stanford Lounge dataset, acquired by an RGB-D sensor, and an outdoor dataset captured by a Lidar sensor. The results clearly demonstrate that our approach provides superior results both in terms of robustness and accuracy compared to state-of-the-art probabilistic methods.
  •  
24.
  • Danelljan, Martin, 1989-, et al. (authors)
  • ATOM: Accurate tracking by overlap maximization
  • 2019
  • In: 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019). - : IEEE. - 9781728132938 ; pp. 4655-4664
  • Conference paper (peer-reviewed), abstract
    • While recent years have witnessed astonishing improvements in visual tracking robustness, the advancements in tracking accuracy have been limited. As the focus has been directed towards the development of powerful classifiers, the problem of accurate target state estimation has been largely overlooked. In fact, most trackers resort to a simple multi-scale search in order to estimate the target bounding box. We argue that this approach is fundamentally limited since target estimation is a complex task, requiring high-level knowledge about the object. We address this problem by proposing a novel tracking architecture, consisting of dedicated target estimation and classification components. High-level knowledge is incorporated into the target estimation through extensive offline learning. Our target estimation component is trained to predict the overlap between the target object and an estimated bounding box. By carefully integrating target-specific information, our approach achieves previously unseen bounding box accuracy. We further introduce a classification component that is trained online to guarantee high discriminative power in the presence of distractors. Our final tracking framework sets a new state-of-the-art on five challenging benchmarks. On the new large-scale TrackingNet dataset, our tracker ATOM achieves a relative gain of 15% over the previous best approach, while running at over 30 FPS. Code and models are available at https://github.com/visionml/pytracking.
  •  
25.
  • Danelljan, Martin, 1989-, et al. (authors)
  • Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking
  • 2016
  • In: Computer Vision – ECCV 2016. - Cham : Springer. - 9783319464534 - 9783319464541 ; pp. 472-488
  • Conference paper (peer-reviewed), abstract
    • Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.
  •  
  • Results 1-25 of 64
Type of publication
conference paper (47)
journal article (9)
other publication (5)
doctoral thesis (3)
Type of content
peer-reviewed (54)
other scholarly/artistic (10)
Author/editor
Khan, Fahad Shahbaz, ... (45)
Felsberg, Michael, 1 ... (27)
Danelljan, Martin, 1 ... (21)
Felsberg, Michael (13)
van de Weijer, Joost (12)
Khan, Fahad Shahbaz (12)
Bhat, Goutam (11)
Danelljan, Martin (10)
Shao, Ling (8)
Häger, Gustav (7)
Khan, Salman (6)
Anwer, Rao Muhammad (6)
Matas, Jiri (6)
Eldesokey, Abdelrahm ... (6)
Leonardis, Ales (6)
Fernandez, Gustavo (6)
Johnander, Joakim (5)
Häger, Gustav, 1988- (5)
Kristan, Matej (5)
Pflugfelder, Roman (5)
Lukezic, Alan (5)
Cholakkal, Hisham (4)
Pang, Yanwei (4)
Vojır, Tomas (4)
Porikli, Fatih (4)
Bertinetto, Luca (4)
Golodetz, Stuart (4)
Järemo-Lawin, Felix (4)
Wang, Dong (3)
Khan, S (3)
Berg, Amanda, 1988- (3)
Li, Yang (3)
Torr, Philip H.S. (3)
Li, Bo (3)
Zhao, Fei (3)
Tang, Ming (3)
Robinson, Andreas, 1 ... (3)
Yang, Ming-Hsuan (3)
Bowden, Richard (3)
Cehovin, Luka (3)
Zhu, Jianke (3)
Wang, Jinqiao (3)
Martinez, Jose M. (3)
Wen, Longyin (3)
Miksik, Ondrej (3)
Martin-Nieto, Rafael (3)
Petrosino, Alfredo (3)
Possegger, Horst (3)
Hadfield, Simon (3)
Naseer, Muzammal (3)
University
Linköpings universitet (62)
Göteborgs universitet (1)
Uppsala universitet (1)
Högskolan i Skövde (1)
Chalmers tekniska högskola (1)
Karolinska Institutet (1)
Högskolan Dalarna (1)
Language
English (64)
Research subject (UKÄ/SCB)
Natural sciences (55)
Engineering and technology (7)
Medicine and health sciences (1)

Year


 
