SwePub
Search the SwePub database


Result list for the search "L773:9781728171685"

Search: L773:9781728171685

  • Results 1-9 of 9
1.
  • Eldesokey, Abdelrahman, et al. (author)
  • Uncertainty-Aware CNNs for Depth Completion : Uncertainty from Beginning to End
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685, 9781728171692. pp. 12011-12020
  • Conference paper (peer-reviewed), abstract:
    • The focus in deep learning research has been mostly to push the limits of prediction accuracy. However, this was often achieved at the cost of increased complexity, raising concerns about the interpretability and the reliability of deep networks. Recently, increasing attention has been given to untangling the complexity of deep networks and quantifying their uncertainty for different computer vision tasks. In contrast, the task of depth completion has not received as much attention, despite the inherently noisy nature of depth sensors. In this work, we thus focus on modeling the uncertainty of depth data in depth completion, starting from the sparse noisy input all the way to the final prediction. We propose a novel approach to identify disturbed measurements in the input by learning an input confidence estimator in a self-supervised manner based on normalized convolutional neural networks (NCNNs). Further, we propose a probabilistic version of NCNNs that produces a statistically meaningful uncertainty measure for the final prediction. When we evaluate our approach on the KITTI dataset for depth completion, we outperform all existing Bayesian deep learning approaches in terms of prediction accuracy, quality of the uncertainty measure, and computational efficiency. Moreover, our small network with 670k parameters performs on par with conventional approaches with millions of parameters. These results give strong evidence that separating the network into parallel uncertainty and prediction streams leads to state-of-the-art performance with accurate uncertainty estimates.
  •  
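For orientation, the normalized convolution that the abstract above builds on has a compact form: the sparse input is convolved jointly with a confidence map, and the result is renormalized by the propagated confidence. Below is a minimal PyTorch sketch of such a layer, not the authors' released code; the class name, the non-negativity trick and the toy data are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a normalized convolution layer:
# sparse depth is convolved together with a confidence map, and the output is
# renormalized by the propagated confidence.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, eps=1e-8):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.rand(out_ch, in_ch, kernel_size, kernel_size))
        self.pad = kernel_size // 2

    def forward(self, x, conf):
        w = F.softplus(self.weight)                         # keep filters non-negative
        num = F.conv2d(x * conf, w, padding=self.pad)       # confidence-weighted data term
        den = F.conv2d(conf, w, padding=self.pad)           # propagated confidence
        out = num / (den + self.eps)
        out_conf = den / (w.sum(dim=(1, 2, 3)).view(1, -1, 1, 1) + self.eps)
        return out, out_conf

# Toy usage on a sparse depth map: confidence is 1 where a measurement exists, 0 elsewhere.
depth = torch.zeros(1, 1, 64, 64)
conf = (torch.rand(1, 1, 64, 64) > 0.95).float()
depth[conf.bool()] = torch.rand(int(conf.sum()))
dense, dense_conf = NormalizedConv2d(1, 8)(depth, conf)
```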
2.
  • Fieraru, Mihai, et al. (author)
  • Three-dimensional reconstruction of human interactions
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). ISSN 1063-6919. ISBN 9781728171685. pp. 7212-7221
  • Conference paper (peer-reviewed), abstract:
    • Understanding 3d human interactions is fundamental for fine-grained scene analysis and behavioural modeling. However, most of the existing models focus on analyzing a single person in isolation, and those that process several people focus largely on resolving multi-person data association, rather than inferring interactions. This may lead to incorrect, lifeless 3d estimates that miss the subtle human contact aspects, the essence of the event, and are of little use for detailed behavioral understanding. This paper addresses such issues and makes several contributions: (1) we introduce models for interaction signature estimation (ISP) encompassing contact detection, segmentation, and 3d contact signature prediction; (2) we show how such components can be leveraged in order to produce augmented losses that ensure contact consistency during 3d reconstruction; (3) we construct several large datasets for learning and evaluating 3d contact prediction and reconstruction methods; specifically, we introduce CHI3D, a lab-based accurate 3d motion capture dataset with 631 sequences containing 2,525 contact events and 728,664 ground truth 3d poses, as well as FlickrCI3D, a dataset of 11,216 images with 14,081 processed pairs of people and 81,233 facet-level surface correspondences within 138,213 selected contact regions. Finally, (4) we present models and baselines to illustrate how contact estimation supports meaningful 3d reconstruction where essential interactions are captured. Models and data are made available for research purposes at http://vision.imar.ro/ci3d.
  •  
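The "augmented losses that ensure contact consistency" mentioned in the abstract above can be illustrated with a very small sketch: given predicted 3D vertices for two people and a set of correspondences that a contact signature marks as touching, matched surface points are pulled together. This is an illustrative simplification, not the paper's implementation; the function name, vertex counts and index pairs are made up.

```python
# Illustrative sketch (not the paper's code) of a contact-consistency term:
# penalize the distance between matched surface points on two people.
import torch

def contact_consistency_loss(verts_a, verts_b, pairs):
    """verts_a, verts_b: (V, 3) predicted vertices; pairs: (K, 2) index pairs
    (i on person A, j on person B) that a contact signature marks as touching."""
    pa = verts_a[pairs[:, 0]]          # (K, 3) contact points on person A
    pb = verts_b[pairs[:, 1]]          # (K, 3) matched points on person B
    return ((pa - pb) ** 2).sum(dim=1).mean()

# Toy usage with random geometry and three hypothetical correspondences.
va, vb = torch.rand(6890, 3), torch.rand(6890, 3)
pairs = torch.tensor([[10, 20], [55, 61], [120, 3]])
loss = contact_consistency_loss(va, vb, pairs)
```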
3.
  • Naseer, M., et al. (author)
  • A Self-supervised Approach for Adversarial Robustness
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 259-268
  • Conference paper (peer-reviewed), abstract:
    • Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems, e.g., for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock towards their real-world deployment. The transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training, which enhances robustness by modifying the target model's parameters, lacks such generalizability. On the other hand, different input-processing-based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against unseen adversarial attacks (e.g., by reducing the success rate of the translation-invariant ensemble attack from 82.6% to 31.9% compared to the previous state of the art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for the case of classification, segmentation and detection.
  •  
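The self-supervised ingredient described above can be sketched as a label-free perturbation that maximizes the feature distortion of a fixed backbone; a purifier trained to undo such perturbations (omitted here) would then act as the input-space defense. The snippet below is a generic PGD-style sketch under that reading, not the paper's code; the backbone choice, step sizes and epsilon are assumptions.

```python
# Hedged sketch of a label-free attack signal: perturb the input to maximize
# distortion of a fixed backbone's outputs, with no class labels involved.
import torch
import torchvision.models as models

def self_supervised_perturb(x, backbone, eps=8 / 255, steps=10, alpha=2 / 255):
    backbone.eval()
    with torch.no_grad():
        clean_feat = backbone(x)                         # reference outputs on clean input
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        feat = backbone(x + delta)
        loss = torch.nn.functional.mse_loss(feat, clean_feat)   # distort the representation
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                      # keep the perturbation bounded
        delta.grad.zero_()
    return (x + delta).detach()

x = torch.rand(2, 3, 224, 224)
x_adv = self_supervised_perturb(x, models.resnet18(weights=None))
```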
4.
  • Rajasegaran, J., et al. (author)
  • iTAML : An Incremental Task-Agnostic Meta-learning Approach
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 13585-13594
  • Conference paper (peer-reviewed), abstract:
    • Humans can continuously learn new knowledge as their experience grows. In contrast, previous learning in deep neural networks can quickly fade out when they are trained on a new task. In this paper, we hypothesize that this problem can be avoided by learning a set of generalized parameters that are specific to neither old nor new tasks. In this pursuit, we introduce a novel meta-learning approach that seeks to maintain an equilibrium between all the encountered tasks. This is ensured by a new meta-update rule which avoids catastrophic forgetting. In comparison to previous meta-learning techniques, our approach is task-agnostic. When presented with a continuum of data, our model automatically identifies the task and quickly adapts to it with just a single update. We perform extensive experiments on five datasets in a class-incremental setting, leading to significant improvements over state-of-the-art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks). Specifically, on large-scale datasets that generally prove difficult cases for incremental learning, our approach delivers absolute gains as high as 19.1% and 7.4% on the ImageNet and MS-Celeb datasets, respectively.
  •  
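One way to picture a task-agnostic meta-update of the kind described above is a Reptile-style outer step: adapt a copy of the shared parameters on each task, then move the shared parameters toward the average of the adapted copies so they stay generic to all tasks. The sketch below shows only that generic pattern, not iTAML's exact rule; the learning rates, step counts and loader structure are assumptions.

```python
# Generic Reptile-style meta-update sketch (not the paper's exact rule).
import copy
import torch

def meta_update(model, task_loaders, inner_steps=5, inner_lr=0.01, meta_lr=0.5):
    adapted = []
    for loader in task_loaders:                      # one loader per task
        clone = copy.deepcopy(model)
        opt = torch.optim.SGD(clone.parameters(), lr=inner_lr)
        for _, (x, y) in zip(range(inner_steps), loader):
            opt.zero_grad()
            torch.nn.functional.cross_entropy(clone(x), y).backward()
            opt.step()
        adapted.append([p.detach() for p in clone.parameters()])
    with torch.no_grad():                            # outer (meta) step toward the average
        for i, p in enumerate(model.parameters()):
            avg = torch.stack([a[i] for a in adapted]).mean(dim=0)
            p += meta_lr * (avg - p)

# Toy usage: two "tasks", each a list of (inputs, labels) batches.
model = torch.nn.Linear(10, 4)
tasks = [[(torch.randn(8, 10), torch.randint(0, 4, (8,)))] for _ in range(2)]
meta_update(model, tasks, inner_steps=1)
```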
5.
  • Robinson, Andreas, 1975-, et al. (author)
  • Learning Fast and Robust Target Models for Video Object Segmentation
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 7404-7413
  • Conference paper (peer-reviewed), abstract:
    • Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time. The main difficulty is to effectively handle appearance changes and similar background objects, while maintaining accurate segmentation. Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting. More recent methods integrate generative target appearance models, but either achieve limited robustness or require large amounts of training data. We propose a novel VOS architecture consisting of two network components. The target appearance model consists of a light-weight module, which is learned during the inference stage using fast optimization techniques to predict a coarse but robust target segmentation. The segmentation model is exclusively trained offline, designed to process the coarse scores into high quality segmentation masks. Our method is fast, easily trainable and remains highly effective in cases of limited training data. We perform extensive experiments on the challenging YouTube-VOS and DAVIS datasets. Our network achieves favorable performance, while operating at higher frame-rates compared to state-of-the-art. Code and trained models are available at https://github.com/andr345/frtm-vos.
  •  
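The two-component design described above can be illustrated by fitting a tiny target model (here a single conv layer) to first-frame features and mask with a few optimization steps at test time; its coarse scores would then be refined by an offline-trained segmentation network, which is omitted. A minimal sketch under assumed toy shapes, not the released frtm-vos code.

```python
# Hedged sketch: fit a lightweight target model to the first-frame mask at inference.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fit_target_model(feat, mask, steps=50, lr=1e-2):
    """feat: (1, C, H, W) backbone features of frame 0; mask: (1, 1, H, W) in {0, 1}."""
    target_model = nn.Conv2d(feat.shape[1], 1, kernel_size=3, padding=1)
    opt = torch.optim.Adam(target_model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = target_model(feat)                         # coarse target score map
        F.binary_cross_entropy_with_logits(score, mask).backward()
        opt.step()
    return target_model

feat = torch.rand(1, 64, 30, 52)                 # toy backbone features
mask = (torch.rand(1, 1, 30, 52) > 0.8).float()  # toy first-frame target mask
tm = fit_target_model(feat, mask)
coarse_scores = tm(feat)                         # would be refined by the offline segmentation net
```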
6.
  • Wang, T., et al. (author)
  • Learning Human-Object Interaction Detection Using Interaction Points
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 4115-4124
  • Conference paper (peer-reviewed), abstract:
    • Understanding interactions between humans and objects is one of the fundamental problems in visual classification and an essential step towards detailed scene understanding. Human-object interaction (HOI) detection strives to localize both the human and an object, as well as to identify the complex interactions between them. Most existing HOI detection approaches are instance-centric, where interactions between all possible human-object pairs are predicted based on appearance features and coarse spatial information. We argue that appearance features alone are insufficient to capture complex human-object interactions. In this paper, we therefore propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the interaction. Paired with the densely predicted interaction vectors, the interactions are associated with human and object detections to obtain final predictions. To the best of our knowledge, we are the first to propose an approach where HOI detection is posed as a keypoint detection and grouping problem. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET. Our approach sets a new state-of-the-art on both datasets. Code is available at https://github.com/vaesl/IP-Net.
  •  
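The keypoint-and-grouping formulation above can be sketched with one possible grouping rule: an interaction point plus a predicted vector toward the human center is matched against detected human and object box centers by distance. This is an illustrative reading, not the paper's exact association scheme; the function and all coordinates are hypothetical.

```python
# Illustrative grouping step: associate an interaction point with detections.
import numpy as np

def group_interaction(point, vec_to_human, human_centers, object_centers):
    """point, vec_to_human: (2,) arrays; *_centers: (N, 2) arrays of box centers."""
    predicted_human = point + vec_to_human
    h_idx = int(np.argmin(np.linalg.norm(human_centers - predicted_human, axis=1)))
    o_idx = int(np.argmin(np.linalg.norm(object_centers - point, axis=1)))
    return h_idx, o_idx

humans = np.array([[40.0, 60.0], [200.0, 80.0]])
objects = np.array([[90.0, 70.0], [250.0, 90.0]])
h, o = group_interaction(np.array([85.0, 68.0]), np.array([-42.0, -9.0]), humans, objects)
```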
7.
  • Wang, Y., et al. (author)
  • MineGAN : Effective Knowledge Transfer From GANs to Target Domains With Few Images
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 9329-9338
  • Conference paper (peer-reviewed), abstract:
    • One of the attractive characteristics of deep neural networks is their ability to transfer knowledge obtained in one domain to other related domains. As a result, high-quality networks can be trained in domains with relatively little training data. This property has been extensively studied for discriminative networks but has received significantly less attention for generative models. Given the often enormous effort required to train GANs, both computationally as well as in the dataset collection, the re-use of pretrained GANs is a desirable objective. We propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the posterior finetuning and avoids pathologies of other methods such as mode collapse and lack of flexibility. We perform experiments on several complex datasets using various GAN architectures (BigGAN, Progressive GAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs. Our code is available at: https://github.com/yaxingwang/MineGAN.
  •  
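The "miner network" idea above amounts to a small network that remaps input noise into the latent space of a frozen pretrained generator, so that sampling drifts toward the target domain. A minimal sketch follows; the stand-in generator, layer sizes and the omitted adversarial loop are assumptions, not the MineGAN code.

```python
# Hedged sketch of a miner network in front of a frozen pretrained generator.
import torch
import torch.nn as nn

class Miner(nn.Module):
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, z_dim), nn.ReLU(),
            nn.Linear(z_dim, z_dim),
        )

    def forward(self, z):
        return self.net(z)

z_dim = 128
pretrained_generator = nn.Sequential(nn.Linear(z_dim, 3 * 32 * 32), nn.Tanh())  # stand-in
for p in pretrained_generator.parameters():
    p.requires_grad_(False)            # generator stays frozen; only the miner is trained

miner = Miner(z_dim)
z = torch.randn(16, z_dim)
fake = pretrained_generator(miner(z)).view(16, 3, 32, 32)
# `fake` would be fed to a discriminator on target-domain images to train the miner.
```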
8.
  • Wang, Y., et al. (author)
  • Semi-Supervised Learning for Few-Shot Image-to-Image Translation
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 4452-4461
  • Conference paper (peer-reviewed), abstract:
    • In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and also reduce the amount of required labeled data from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: https://github.com/yaxingwang/SEMIT.
  •  
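A noise-tolerant pseudo-labeling step of the kind mentioned above can be sketched as confidence-thresholded self-labeling: unlabeled source images receive a pseudo-label only when the classifier is sufficiently confident, and uncertain samples are left out. The threshold, toy classifier and shapes below are illustrative assumptions, not the SEMIT implementation.

```python
# Generic confidence-thresholded pseudo-labeling sketch.
import torch

def pseudo_label(classifier, unlabeled_batch, threshold=0.9):
    with torch.no_grad():
        probs = torch.softmax(classifier(unlabeled_batch), dim=1)
        conf, labels = probs.max(dim=1)
    keep = conf >= threshold                       # discard uncertain predictions
    return unlabeled_batch[keep], labels[keep]

classifier = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 10))
images = torch.rand(32, 3, 64, 64)
x_pl, y_pl = pseudo_label(classifier, images)      # usable as extra "labeled" data
```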
9.
  • Örnhag, Marcus Valtonen, et al. (author)
  • A Unified Optimization Framework for Low-Rank Inducing Penalties
  • 2020
  • In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. ISSN 1063-6919. ISBN 9781728171685. pp. 8471-8480
  • Conference paper (peer-reviewed), abstract:
    • In this paper we study the convex envelopes of a new class of functions. Using this approach, we are able to unify two important classes of regularizers: unbiased non-convex formulations and weighted nuclear norm penalties. This opens up the possibility of combining the best of both worlds, and of leveraging each method's contribution in cases where simply enforcing one of the regularizers is insufficient. We show that the proposed regularizers can be incorporated in standard splitting schemes such as the Alternating Direction Method of Multipliers (ADMM), and other subgradient methods. Furthermore, we provide an efficient way of computing the proximal operator. Lastly, we show on real non-rigid structure-from-motion (NRSfM) datasets the issues that arise from using weighted nuclear norm penalties, and how these can be remedied using our proposed method.
  •  
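As background for the proximal operator mentioned above, the standard (unweighted) nuclear norm prox is singular value soft-thresholding; the paper's framework generalizes penalties of this kind, and an ADMM splitting scheme calls such a prox step repeatedly. The sketch below shows only this baseline operator, not the proposed regularizer.

```python
# Proximal operator of the (unweighted) nuclear norm via soft-thresholded SVD.
import numpy as np

def prox_nuclear_norm(X, lam):
    """argmin_Z  0.5 * ||Z - X||_F^2 + lam * ||Z||_*"""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

X = np.random.randn(12, 8)
Z = prox_nuclear_norm(X, lam=0.5)   # Z is a low-rank-biased estimate of X
```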