SwePub
Search the SwePub database


Result list for the search "WFRF:(Khan Fahad Shahbaz) srt2:(2020-2023)"

Search: WFRF:(Khan Fahad Shahbaz) > (2020-2023)

  • Results 1-10 of 20
1.
  • Bhunia, Ankan Kumar, et al. (authors)
  • Handwriting Transformers
  • 2021
  • In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021). IEEE. ISBN 9781665428125, 9781665428132. pp. 1066-1074
  • Other publication (other academic/artistic). Abstract:
    • We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns. The proposed HWT captures the long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state of the art, as demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.
  •  
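The abstract above describes an encoder-decoder transformer in which style examples are encoded with self-attention and each query character gathers its own style representation through cross-attention. A minimal sketch of that idea follows, assuming pre-extracted style features and a character vocabulary; the dimensions, the backbone and the image decoder are illustrative assumptions, not the paper's exact design.

    # Minimal sketch (PyTorch): character queries attend to encoded style features.
    import torch
    import torch.nn as nn

    class StyleContentAttention(nn.Module):
        def __init__(self, d_model: int = 256, nhead: int = 8, vocab_size: int = 80):
            super().__init__()
            self.char_embed = nn.Embedding(vocab_size, d_model)   # content (query) characters
            self.transformer = nn.Transformer(d_model=d_model, nhead=nhead,
                                              num_encoder_layers=3, num_decoder_layers=3,
                                              batch_first=True)

        def forward(self, style_feats: torch.Tensor, char_ids: torch.Tensor) -> torch.Tensor:
            # style_feats: (B, S, d_model) features from a few style example images
            # char_ids:    (B, T) ids of the characters to render
            queries = self.char_embed(char_ids)
            # Encoder self-attention captures global/local style patterns; decoder
            # cross-attention entangles each character query with the style memory.
            return self.transformer(src=style_feats, tgt=queries)  # (B, T, d_model)

    # The per-character outputs would then feed an image-generation decoder (not shown here).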
2.
  • Joseph, KJ, et al. (authors)
  • Towards Open World Object Detection
  • 2021
  • In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021). IEEE Computer Society. ISBN 9781665445092. pp. 5826-5836
  • Conference paper (other academic/artistic). Abstract:
    • Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called 'Open World Object Detection', where a model is tasked to: 1) identify objects that have not been introduced to it as 'unknown', without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy-based unknown identification. Our experimental evaluation and ablation studies analyse the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterising unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial, research direction.
  •  
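The abstract above names two ingredients: contrastive clustering around class prototypes and energy-based identification of unknowns. The sketch below illustrates both in a generic form; the margin, the prototype update rule and the energy threshold are assumptions rather than ORE's exact formulation.

    # Generic sketch (PyTorch) of prototype-based contrastive clustering and energy scoring.
    import torch
    import torch.nn.functional as F

    def contrastive_clustering_loss(feats, labels, prototypes, margin: float = 10.0):
        # feats: (B, D) instance features, labels: (B,), prototypes: (C, D) running class means
        dists = torch.cdist(feats, prototypes)                  # (B, C) Euclidean distances
        pos = dists.gather(1, labels.view(-1, 1)).squeeze(1)    # distance to own class prototype
        other = ~F.one_hot(labels, prototypes.size(0)).bool()   # mask of all other classes
        neg = torch.clamp(margin - dists[other], min=0.0)       # push other prototypes away
        return pos.mean() + neg.mean()

    def energy_score(logits: torch.Tensor) -> torch.Tensor:
        # Low energy ~ confident known class; high-energy detections can be flagged as 'unknown'.
        return -torch.logsumexp(logits, dim=1)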
3.
  • Narayan, Sanath, et al. (authors)
  • Discriminative Region-based Multi-Label Zero-Shot Learning
  • 2021
  • In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021). IEEE. ISBN 9781665428125. pp. 8711-8720
  • Other publication (other academic/artistic). Abstract:
    • Multi-label zero-shot learning (ZSL) is a more realistic counterpart of standard single-label ZSL since several objects can co-exist in a natural image. However, the occurrence of multiple objects complicates the reasoning and requires region-specific processing of visual features to preserve their contextual cues. We note that the best existing multi-label ZSL method takes a shared approach towards attending to region features with a common set of attention maps for all the classes. Such shared maps lead to diffused attention, which does not discriminatively focus on relevant locations when the number of classes is large. Moreover, mapping spatially-pooled visual features to the class semantics leads to inter-class feature entanglement, thus hampering the classification. Here, we propose an alternate approach towards region-based discriminability-preserving multi-label zero-shot classification. Our approach maintains the spatial resolution to preserve region-level characteristics and utilizes a bi-level attention module (BiAM) to enrich the features by incorporating both region and scene context information. The enriched region-level features are then mapped to the class semantics and only their class predictions are spatially pooled to obtain image-level predictions, thereby keeping the multi-class features disentangled. Our approach sets a new state of the art on two large-scale multi-label zero-shot benchmarks: NUS-WIDE and Open Images. On NUS-WIDE, our approach achieves an absolute gain of 6.9% mAP for ZSL, compared to the best published results.
  •  
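The key mechanism in the abstract above is that region-level features are classified against class semantics first, and only the class predictions are pooled spatially. A minimal sketch of that pooling order follows; the feature-enrichment module (BiAM) itself is omitted and the max-pooling choice is an assumption.

    # Sketch (PyTorch): classify each region, then pool predictions over regions.
    import torch

    def region_based_multilabel_scores(region_feats: torch.Tensor,
                                       class_embeds: torch.Tensor) -> torch.Tensor:
        # region_feats: (B, R, D) enriched region features
        # class_embeds: (C, D) class semantic embeddings (e.g. word vectors)
        region_scores = torch.einsum('brd,cd->brc', region_feats, class_embeds)  # per-region class scores
        # Pooling predictions (not features) keeps the per-class evidence disentangled.
        image_scores, _ = region_scores.max(dim=1)              # (B, C) image-level scores
        return image_scores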
4.
  • Naseer, M., et al. (authors)
  • A Self-supervised Approach for Adversarial Robustness
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 259-268
  • Conference paper (peer-reviewed). Abstract:
    • Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems, e.g., for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock towards their real-world deployment. The transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training, which enhances robustness by modifying the target model's parameters, lacks such generalizability. On the other hand, different input-processing-based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against unseen adversarial attacks (e.g., by reducing the success rate of the translation-invariant ensemble attack from 82.6% to 31.9% compared to the previous state of the art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for the case of classification, segmentation and detection.
  •  
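The abstract above combines a label-free (self-supervised) attack with an input-space defence. The sketch below shows one way to craft such perturbations by maximising feature distortion, after which a separate purifier network could be trained to undo them; the step sizes, the feature extractor and the purifier are illustrative assumptions, not the paper's exact recipe.

    # Sketch (PyTorch): label-free perturbations via feature distortion, for training an input-space purifier.
    import torch
    import torch.nn.functional as F

    def self_supervised_attack(x, feature_extractor, eps=8/255, alpha=2/255, steps=5):
        x = x.detach()
        with torch.no_grad():
            clean_feats = feature_extractor(x)                  # reference features, no labels needed
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.mse_loss(feature_extractor(x_adv), clean_feats)
            grad, = torch.autograd.grad(loss, x_adv)
            x_adv = (x_adv + alpha * grad.sign()).detach()      # ascend the distortion loss
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # stay in the eps-ball and valid range
        return x_adv

    # Defence idea: train a purifier so that purifier(self_supervised_attack(x, fe)) ~ x,
    # then place it in front of any classifier, detector or segmentation model at test time.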
5.
  • Naseer, Muzammal, et al. (authors)
  • On Generating Transferable Targeted Perturbations
  • 2021
  • In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021). IEEE. ISBN 9781665428125, 9781665428132. pp. 7688-7697
  • Other publication (other academic/artistic). Abstract:
    • While the untargeted black-box transferability of adversarial perturbations has been extensively studied before, changing an unseen model's decisions to a specific 'targeted' class remains a challenging feat. In this paper, we propose a new generative approach for highly transferable targeted perturbations. We note that the existing methods are less suitable for this task due to their reliance on class-boundary information that changes from one model to another, thus reducing transferability. In contrast, our approach matches the perturbed image 'distribution' with that of the target class, leading to high targeted transferability rates. To this end, we propose a new objective function that not only aligns the global distributions of source and target images, but also matches the local neighbourhood structure between the two domains. Based on the proposed objective, we train a generator function that can adaptively synthesize perturbations specific to a given input. Our generative approach is independent of the source or target domain labels, and consistently performs well against state-of-the-art methods on a wide range of attack settings. As an example, we achieve 32.63% target transferability from (an adversarially weak) VGG19BN to (a strong) WideResNet on the ImageNet validation set, which is 4× higher than the previous best generative attack and 16× better than the instance-specific iterative attack.
  •  
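The core of the objective described above is matching the model's output distribution on perturbed source images to its distribution on real target-class images. A stripped-down sketch of that global matching term follows; the generator architecture, the perturbation-budget handling and the local neighbourhood-matching term are assumptions or omitted.

    # Sketch (PyTorch): train a generator so perturbed source images mimic the target-class distribution.
    import torch
    import torch.nn.functional as F

    def global_matching_loss(model, generator, src_imgs, tgt_imgs, eps=16/255):
        # src_imgs, tgt_imgs: equally sized batches of source and target-class images in [0, 1]
        adv = (src_imgs + eps * torch.tanh(generator(src_imgs))).clamp(0, 1)  # bounded perturbation
        log_p_adv = F.log_softmax(model(adv), dim=1)
        with torch.no_grad():
            p_tgt = F.softmax(model(tgt_imgs), dim=1)
        # KL divergence pulls the perturbed-source predictions towards the target-class ones.
        return F.kl_div(log_p_adv, p_tgt, reduction='batchmean')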
6.
  •  
7.
  • Rajasegaran, J., et al. (authors)
  • iTAML: An Incremental Task-Agnostic Meta-learning Approach
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 13585-13594
  • Conference paper (peer-reviewed). Abstract:
    • Humans can continuously learn new knowledge as their experience grows. In contrast, previous learning in deep neural networks can quickly fade out when they are trained on a new task. In this paper, we hypothesize this problem can be avoided by learning a set of generalized parameters that are neither specific to old nor to new tasks. In this pursuit, we introduce a novel meta-learning approach that seeks to maintain an equilibrium between all the encountered tasks. This is ensured by a new meta-update rule which avoids catastrophic forgetting. In comparison to previous meta-learning techniques, our approach is task-agnostic. When presented with a continuum of data, our model automatically identifies the task and quickly adapts to it with just a single update. We perform extensive experiments on five datasets in a class-incremental setting, leading to significant improvements over state-of-the-art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks). Specifically, on large-scale datasets that generally prove difficult cases for incremental learning, our approach delivers absolute gains as high as 19.1% and 7.4% on the ImageNet and MS-Celeb datasets, respectively.
  •  
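The abstract above refers to a meta-update rule that keeps the shared parameters in equilibrium across all encountered tasks. The sketch below shows a generic, Reptile-style version of that idea (adapt a copy per task, then move the shared weights towards the average); it is only an illustration of the principle, not iTAML's actual update rule.

    # Generic sketch (PyTorch) of a task-balanced meta-update over all tasks seen so far.
    import copy
    import torch

    def meta_update(model, task_loaders, loss_fn, inner_lr=0.01, meta_lr=0.5):
        base = copy.deepcopy(model.state_dict())
        adapted = []
        for loader in task_loaders:                             # one data loader per encountered task
            model.load_state_dict(base)
            opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
            x, y = next(iter(loader))
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()                                          # one inner adaptation step on this task
            adapted.append(copy.deepcopy(model.state_dict()))
        new_state = {}
        for k in base:                                          # move towards the mean of adapted copies
            if base[k].is_floating_point():
                avg = torch.stack([a[k] for a in adapted]).mean(dim=0)
                new_state[k] = base[k] + meta_lr * (avg - base[k])
            else:
                new_state[k] = base[k]                          # leave integer buffers untouched
        model.load_state_dict(new_state)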
8.
  • Ranasinghe, Kanchana, et al. (authors)
  • Orthogonal Projection Loss
  • 2021
  • In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021). IEEE. ISBN 9781665428125. pp. 12313-12323
  • Other publication (other academic/artistic). Abstract:
    • Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de-facto objective function. The CE loss encourages features of a class to have a higher projection score on the true class-vector compared to the negative classes. However, this is a relative constraint and does not explicitly force different class features to be well-separated. Motivated by the observation that ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed 'Orthogonal Projection Loss' (OPL) which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints on the mini-batch level. Compared to other alternatives to CE, OPL offers unique advantages: it requires no additional learnable parameters, does not need careful negative mining and is not sensitive to the batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise.
  •  
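The loss described above has two parts computed within a mini-batch: pull same-class features together and push different-class features towards orthogonality. A minimal sketch follows, assuming L2-normalised features; the exact weighting of the two terms in the paper may differ.

    # Sketch (PyTorch) of an orthogonality-enforcing auxiliary loss in the spirit of OPL.
    import torch
    import torch.nn.functional as F

    def orthogonal_projection_loss(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (B, D) mini-batch features, labels: (B,) integer class labels
        feats = F.normalize(features, dim=1)                    # unit-norm features
        sim = feats @ feats.t()                                 # pairwise cosine similarities (B, B)
        same = labels.unsqueeze(0) == labels.unsqueeze(1)
        eye = torch.eye(len(labels), dtype=torch.bool, device=features.device)
        pos = same & ~eye                                       # same-class pairs, excluding self
        neg = ~same                                             # different-class pairs
        pos_term = (1.0 - sim[pos]).mean() if pos.any() else sim.sum() * 0.0
        neg_term = sim[neg].abs().mean() if neg.any() else sim.sum() * 0.0
        return pos_term + neg_term                              # intra-class clustering + inter-class orthogonality

    # Typical use: loss = F.cross_entropy(logits, labels) + lam * orthogonal_projection_loss(feats, labels)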
9.
  • Wang, Y., et al. (authors)
  • Semi-Supervised Learning for Few-Shot Image-to-Image Translation
  • 2020
  • In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. ISBN 9781728171685. pp. 4452-4461
  • Conference paper (peer-reviewed). Abstract:
    • In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: https://github.com/yaxingwang/SEMIT.
  •  
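A central ingredient named in the abstract above is noise-tolerant pseudo-labelling of unlabeled source images. The sketch below shows the simplest confidence-thresholded variant of that idea; the threshold, the classifier and the rest of the SEMIT pipeline (cycle consistency, the translation network) are assumptions or omitted.

    # Sketch (PyTorch): keep only the pseudo-labels the classifier is confident about.
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def pseudo_label(classifier, unlabeled_imgs, threshold: float = 0.95):
        probs = F.softmax(classifier(unlabeled_imgs), dim=1)
        conf, labels = probs.max(dim=1)
        keep = conf >= threshold                                # drop likely-noisy pseudo-labels
        return unlabeled_imgs[keep], labels[keep]

    # The retained (image, pseudo-label) pairs are then mixed with the small labeled set for training.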
10.
  • Abbafati, Cristiana, et al. (authors)
  • 2020
  • Journal article (peer-reviewed)
  •  