SwePub
Search the SwePub database

  Advanced search

Results for the search "WFRF:(Zamir Syed Waqas)"

  • Results 1-8 of 8
1.
  • Bhat, Goutam, et al. (author)
  • NTIRE 2022 Burst Super-Resolution Challenge
  • 2022
  • In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022). IEEE. ISBN 9781665487399, 9781665487405; pp. 1040-1060
  • Conference paper (peer-reviewed), abstract:
    • Burst super-resolution has received increased attention in recent years due to its applications in mobile photography. By merging information from multiple shifted images of a scene, burst super-resolution aims to recover details which otherwise cannot be obtained from a single input image. This paper reviews the NTIRE 2022 challenge on burst super-resolution. In the challenge, the participants were tasked with generating a clean RGB image with 4x higher resolution, given a noisy RAW burst as input. That is, the methods need to perform joint denoising, demosaicking, and super-resolution. The challenge consisted of two tracks. Track 1 employed synthetic data, where pixel-accurate high-resolution ground truths are available. Track 2, on the other hand, used real-world bursts captured with a handheld camera, along with approximately aligned reference images captured using a DSLR. Fourteen teams participated in the final testing phase. The top-performing methods establish a new state of the art on the burst super-resolution task. (A minimal sketch of the task's input/output contract follows this entry.)
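
A minimal sketch of the challenge's input/output contract described in the entry above: a noisy RAW burst in, one clean RGB image at 4x resolution out, so a method must denoise, demosaic, and super-resolve jointly. The tensor layout (14 packed RGGB frames) and the crude single-frame baseline below are illustrative assumptions, not the challenge's actual data format.

    import torch
    import torch.nn.functional as F

    burst = torch.rand(14, 4, 48, 48)  # assumed layout: 14 packed RGGB frames, each 4 x H/2 x W/2

    def naive_baseline(burst):
        # Use only the first (reference) frame; real methods merge all frames.
        r, g1, g2, b = burst[0].unbind(0)
        rgb = torch.stack([r, (g1 + g2) / 2, b], dim=0)  # crude demosaic by channel averaging
        # 2x to undo the RGGB packing, then 4x super-resolution = 8x total.
        return F.interpolate(rgb[None], scale_factor=8, mode="bilinear",
                             align_corners=False)[0]

    print(naive_baseline(burst).shape)  # torch.Size([3, 384, 384]): clean RGB at 4x
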
2.
  • Dudhane, Akshay, et al. (author)
  • Burst Image Restoration and Enhancement
  • 2022
  • In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022). IEEE Computer Society. ISBN 9781665469463, 9781665469470; pp. 5749-5758
  • Conference paper (peer-reviewed), abstract:
    • Modern handheld devices can acquire a burst image sequence in quick succession. However, the individual acquired frames suffer from multiple degradations and are misaligned due to camera shake and object motion. The goal of burst image restoration is to effectively combine complementary cues across multiple burst frames to generate high-quality outputs. Towards this goal, we develop a novel approach by solely focusing on the effective information exchange between burst frames, such that the degradations get filtered out while the actual scene details are preserved and enhanced. Our central idea is to create a set of pseudo-burst features that combine complementary information from all the input burst frames to seamlessly exchange information. However, the pseudo-burst cannot be successfully created unless the individual burst frames are properly aligned to discount inter-frame movements. Therefore, our approach initially extracts pre-processed features from each burst frame and matches them using an edge-boosting burst alignment module. The pseudo-burst features are then created and enriched using multi-scale contextual information. Our final step is to adaptively aggregate information from the pseudo-burst features while progressively increasing resolution in multiple stages. In comparison to existing works that usually follow a late fusion scheme with single-stage upsampling, our approach performs favorably, delivering state-of-the-art performance on burst super-resolution, burst low-light image enhancement, and burst denoising tasks. The source code and pre-trained models are available at https://github.com/akshaydudhane16/BIPNet. (A minimal sketch of the pseudo-burst idea follows this entry.)
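
A minimal sketch of the pseudo-burst idea from the entry above: after alignment, the j-th channel of every burst frame is stacked and fused, so each resulting feature map mixes information from all frames. Layer shapes and sizes are illustrative assumptions, not BIPNet's actual design (see the linked repository for that).

    import torch
    import torch.nn as nn

    class PseudoBurst(nn.Module):
        def __init__(self, n_frames=8, feat_dim=32):
            super().__init__()
            # Fuses, for each channel index j, that channel taken from all N frames.
            self.mix = nn.Conv2d(n_frames, feat_dim, 3, padding=1)

        def forward(self, feats):              # feats: (N, C, H, W), aligned burst features
            pseudo = feats.transpose(0, 1)     # (C, N, H, W): channel j across all N frames
            return self.mix(pseudo)            # (C, feat_dim, H, W): one fused map per channel

    feats = torch.rand(8, 32, 64, 64)
    print(PseudoBurst()(feats).shape)          # torch.Size([32, 32, 64, 64])
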
3.
  • Dudhane, Akshay, et al. (author)
  • Burstormer: Burst Image Restoration and Enhancement Transformer
  • 2023
  • In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society. ISBN 9798350301298, 9798350301304; pp. 5703-5712
  • Conference paper (peer-reviewed), abstract:
    • On a shutter press, modern handheld cameras capture multiple images in rapid succession and merge them to generate a single image. However, individual frames in a burst are misaligned due to inevitable motion and contain multiple degradations. The challenge is to properly align the successive image shots and merge their complementary information to achieve high-quality outputs. To this end, we propose Burstormer, a novel transformer-based architecture for burst image restoration and enhancement. In comparison to existing works, our approach exploits multi-scale local and non-local features to achieve improved alignment and feature fusion. Our key idea is to enable inter-frame communication in the burst neighborhood for information aggregation and progressive fusion while modeling the burst-wide context. However, the input burst frames need to be properly aligned before fusing their information. Therefore, we propose an enhanced deformable alignment module for aligning burst features with respect to the reference frame. Unlike existing methods, the proposed alignment module not only aligns burst features but also exchanges feature information and maintains focused communication with the reference frame through the proposed reference-based feature enrichment mechanism, which facilitates handling complex motion. After multi-level alignment and enrichment, we re-emphasize inter-frame communication within the burst using a cyclic burst sampling module. Finally, the inter-frame information is aggregated using the proposed burst feature fusion module, followed by progressive upsampling. Our Burstormer outperforms state-of-the-art methods on burst super-resolution, burst denoising, and burst low-light enhancement. Our codes and pre-trained models are available at https://github.com/akshaydudhane16/Burstormer. (A minimal sketch of reference-based deformable alignment follows this entry.)
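
A minimal sketch of reference-based deformable alignment as described in the entry above: per-pixel offsets are predicted from each frame paired with the reference, and a deformable convolution resamples the frame features toward it. Layer sizes and the single offset group are illustrative assumptions, not Burstormer's actual module.

    import torch
    import torch.nn as nn
    from torchvision.ops import DeformConv2d

    class RefAlign(nn.Module):
        def __init__(self, c=32, k=3):
            super().__init__()
            # Predicts (dy, dx) for each of the k*k kernel taps at every pixel.
            self.offset = nn.Conv2d(2 * c, 2 * k * k, 3, padding=1)
            self.deform = DeformConv2d(c, c, k, padding=k // 2)

        def forward(self, feats):               # feats: (N, C, H, W); frame 0 is the reference
            ref = feats[:1].expand_as(feats)    # broadcast the reference to every frame
            off = self.offset(torch.cat([feats, ref], dim=1))
            return self.deform(feats, off)      # burst features aligned to the reference

    feats = torch.rand(8, 32, 64, 64)
    print(RefAlign()(feats).shape)              # torch.Size([8, 32, 64, 64])
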
4.
  • Khan, Salman, et al. (author)
  • Transformers in Vision: A Survey
  • 2022
  • In: ACM Computing Surveys. Association for Computing Machinery. ISSN 0360-0300, 1557-7341; 54:10
  • Journal article (peer-reviewed), abstract:
    • Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable the modeling of long-range dependencies between input sequence elements and support parallel processing of sequences, in contrast to recurrent networks such as long short-term memory (LSTM). Unlike convolutional networks, Transformers require minimal inductive biases in their design and are naturally suited as set functions. Furthermore, the straightforward design of Transformers allows processing multiple modalities (e.g., images, videos, text, and speech) using similar processing blocks, and demonstrates excellent scalability to very large capacity networks and huge datasets. These strengths have led to exciting progress on a number of vision tasks using Transformer networks. This survey aims to provide a comprehensive overview of Transformer models in the computer vision discipline. We start with an introduction to the fundamental concepts behind the success of Transformers, i.e., self-attention, large-scale pre-training, and bidirectional feature encoding. We then cover extensive applications of Transformers in vision, including popular recognition tasks (e.g., image classification, object detection, action recognition, and segmentation), generative modeling, multi-modal tasks (e.g., visual question answering, visual reasoning, and visual grounding), video processing (e.g., activity recognition, video forecasting), low-level vision (e.g., image super-resolution, image enhancement, and colorization), and three-dimensional analysis (e.g., point cloud classification and segmentation). We compare the respective advantages and limitations of popular techniques, both in terms of architectural design and experimental value. Finally, we provide an analysis of open research directions and possible future work. We hope this effort will ignite further interest in the community to solve current challenges toward the application of transformer models in computer vision. (A minimal self-attention sketch follows this entry.)
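
The survey above rests on self-attention, so a single-head version is sketched here: every token attends to every other token, modeling dependencies of arbitrary range in one parallel step rather than through the sequential updates of an LSTM. The token count and dimensions are arbitrary illustrative choices.

    import torch

    def self_attention(x, wq, wk, wv):           # x: (tokens, dim)
        q, k, v = x @ wq, x @ wk, x @ wv
        att = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)  # (tokens, tokens)
        return att @ v                           # each output token mixes all input tokens

    dim = 64
    x = torch.rand(196, dim)                     # e.g. a 14x14 grid of image patch tokens
    wq, wk, wv = (torch.rand(dim, dim) for _ in range(3))
    print(self_attention(x, wq, wk, wv).shape)   # torch.Size([196, 64])
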
5.
  • Mehta, Nancy, et al. (author)
  • Adaptive Feature Consolidation Network for Burst Super-Resolution
  • 2022
  • In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022). IEEE. ISBN 9781665487399, 9781665487405; pp. 1278-1285
  • Conference paper (peer-reviewed), abstract:
    • Modern digital cameras generally rely on image signal processing (ISP) pipelines for producing naturalistic RGB images. Nevertheless, in comparison to DSLR cameras, portable mobile devices generally output low-quality images due to their physical limitations. These low-quality images usually carry multiple degradations: low resolution owing to small camera sensors, mosaic patterns on account of the camera filter array, and sub-pixel shifts due to camera motion. Such degradations usually restrain the performance of single-image super-resolution methods in retrieving a high-resolution (HR) image from a single low-resolution (LR) image. Burst image super-resolution aims at restoring a photo-realistic HR image by capturing the abundant information from multiple LR images. Lately, the soaring popularity of burst photography has made multi-frame processing an attractive solution for overcoming the limitations of single-image processing. In our work, we thus propose a generic architecture, the adaptive feature consolidation network (AFCNet), for multi-frame processing. To alleviate the long-range dependency problem that multi-frame approaches struggle with, we utilize an encoder-decoder transformer backbone that learns multi-scale local-global representations. We propose a feature alignment module to align LR burst frame features. The aligned features are then fused and reconstructed by the abridged pseudo-burst fusion and adaptive group upsampling modules, respectively. Our proposed approach clearly outperforms existing state-of-the-art techniques on benchmark datasets. The experimental results illustrate the effectiveness and generality of our framework in improving the visual quality of HR images. (A minimal sketch of staged upsampling follows this entry.)
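
A minimal sketch of staged upsampling in the spirit of the adaptive group upsampling mentioned in the entry above: fused burst features are upscaled in two 2x stages rather than one 4x jump. The adaptive weighting of the real module is replaced here by plain convolutions, an assumption made purely for illustration.

    import torch
    import torch.nn as nn

    class StagedUpsampler(nn.Module):
        def __init__(self, c=32, stages=2):       # two 2x stages = 4x total
            super().__init__()
            self.stages = nn.ModuleList(
                nn.Sequential(nn.Conv2d(c, 4 * c, 3, padding=1), nn.PixelShuffle(2))
                for _ in range(stages))
            self.to_rgb = nn.Conv2d(c, 3, 3, padding=1)

        def forward(self, fused):                 # fused: (B, C, H, W) merged burst features
            for up in self.stages:
                fused = up(fused)                 # double the spatial resolution per stage
            return self.to_rgb(fused)             # (B, 3, 4H, 4W)

    x = torch.rand(1, 32, 48, 48)
    print(StagedUpsampler()(x).shape)             # torch.Size([1, 3, 192, 192])
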
6.
  • Mehta, Nancy, et al. (author)
  • Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement
  • 2023
  • In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society. ISBN 9798350301298, 9798350301304; pp. 22201-22210
  • Conference paper (peer-reviewed), abstract:
    • Burst image processing has become increasingly popular in recent years. However, it is a challenging task, since individual burst images undergo multiple degradations and often have mutual misalignments, resulting in ghosting and zipper artifacts. Existing burst restoration methods usually do not consider the mutual correlation and non-local contextual information among burst frames, which tends to limit these approaches in challenging cases. Another key challenge lies in the robust upsampling of burst frames. Existing upsampling methods cannot effectively utilize the advantages of single-stage and progressive upsampling strategies with conventional and/or recent upsamplers at the same time. To address these challenges, we propose a novel Gated Multi-Resolution Transfer Network (GMTNet) to reconstruct a spatially precise high-quality image from a burst of low-quality raw images. GMTNet consists of three modules optimized for burst processing tasks: Multi-scale Burst Feature Alignment (MBFA) for feature denoising and alignment, Transposed-Attention Feature Merging (TAFM) for multi-frame feature aggregation, and Resolution Transfer Feature Up-sampler (RTFU) to upscale merged features and construct a high-quality output image. Detailed experimental analysis on five datasets validates our approach and sets a new state of the art for burst super-resolution, burst denoising, and low-light burst enhancement. Our codes and models are available at https://github.com/nanmehta/GMTNet. (A minimal sketch of a gated multi-resolution transfer step follows this entry.)
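
A minimal sketch of one gated multi-resolution transfer step suggested by the network's name and the description above: coarse-scale features are upsampled and merged with fine-scale features through a learned gate that decides, per pixel, which scale dominates. Layer sizes and the gating form are illustrative assumptions, not GMTNet's actual design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GatedTransfer(nn.Module):
        def __init__(self, c=32):
            super().__init__()
            self.gate = nn.Conv2d(2 * c, c, 3, padding=1)

        def forward(self, fine, coarse):          # (B, C, H, W) and (B, C, H/2, W/2)
            coarse = F.interpolate(coarse, scale_factor=2, mode="bilinear",
                                   align_corners=False)
            g = torch.sigmoid(self.gate(torch.cat([fine, coarse], dim=1)))
            return g * fine + (1 - g) * coarse    # per-pixel blend of the two scales

    fine, coarse = torch.rand(1, 32, 64, 64), torch.rand(1, 32, 32, 32)
    print(GatedTransfer()(fine, coarse).shape)    # torch.Size([1, 32, 64, 64])
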
7.
  • Shamshad, Fahad, et al. (author)
  • Transformers in medical imaging: A survey
  • 2023
  • In: Medical Image Analysis. Elsevier. ISSN 1361-8415, 1361-8423; 88
  • Journal article (peer-reviewed), abstract:
    • Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications we develop a taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems, and outline promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging. (A minimal sketch of the global-context contrast follows this entry.)
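
The survey's motivating contrast above (global context versus a CNN's local receptive field) in a few lines: after one 3x3 convolution a pixel has seen only a 3x3 window, while after one self-attention step every patch token has seen the whole image. Patch size and dimensions are arbitrary illustrative choices.

    import torch
    import torch.nn as nn

    img = torch.rand(1, 1, 224, 224)                        # e.g. a grayscale scan
    local = nn.Conv2d(1, 8, 3, padding=1)(img)              # each output pixel sees 3x3

    patchify = nn.Conv2d(1, 64, kernel_size=16, stride=16)  # ViT-style patch embedding
    tokens = patchify(img).flatten(2).transpose(1, 2)       # (1, 196, 64) patch tokens
    attn = nn.MultiheadAttention(64, num_heads=4, batch_first=True)
    global_ctx, _ = attn(tokens, tokens, tokens)            # each token sees all 196 patches
    print(local.shape, global_ctx.shape)
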
8.
  • Zamir, Syed Waqas, et al. (author)
  • Restormer: Efficient Transformer for High-Resolution Image Restoration
  • 2022
  • In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022). IEEE Computer Society. ISBN 9781665469463, 9781665469470; pp. 5718-5729
  • Conference paper (peer-reviewed), abstract:
    • Since convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, these models have been extensively applied to image restoration and related tasks. Recently, another class of neural architectures, Transformers, has shown significant performance gains on natural language and high-level vision tasks. While the Transformer model mitigates the shortcomings of CNNs (i.e., limited receptive field and inadaptability to input content), its computational complexity grows quadratically with the spatial resolution, making it infeasible to apply to most image restoration tasks involving high-resolution images. In this work, we propose an efficient Transformer model by making several key design choices in the building blocks (multi-head attention and feed-forward network) such that it can capture long-range pixel interactions while still remaining applicable to large images. Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks, including image deraining, single-image motion deblurring, defocus deblurring (single-image and dual-pixel data), and image denoising (Gaussian grayscale/color denoising, and real image denoising). The source code and pre-trained models are available at https://github.com/swz30/Restormer. (A sketch of the channel-wise attention idea follows this entry.)
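
A sketch of the idea behind the abstract's efficiency claim: attention applied across channels yields a C x C attention map instead of an (HW) x (HW) one, so cost grows linearly with resolution. This single-head version with a shared projection is a simplification of the transposed attention the paper is known for, not the actual MDTA block.

    import torch
    import torch.nn.functional as F

    def channel_attention(x, temperature=1.0):   # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = k = v = x.reshape(b, c, h * w)       # learned projections omitted (assumption)
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        att = torch.softmax(q @ k.transpose(1, 2) / temperature, dim=-1)  # (B, C, C)
        return (att @ v).reshape(b, c, h, w)

    x = torch.rand(1, 48, 128, 128)
    print(channel_attention(x).shape)            # torch.Size([1, 48, 128, 128]), O(HW * C^2)
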