Person Image Synthesis via Denoising Diffusion Model

Bhunia, Ankan Kumar (author)
Mohamed bin Zayed Univ AI, U Arab Emirates
Khan, Salman (author)
Mohamed bin Zayed Univ AI, U Arab Emirates; Australian Natl Univ, Australia
Cholakkal, Hisham (author)
Mohamed bin Zayed Univ AI, U Arab Emirates
Anwer, Rao Muhammad (author)
Mohamed bin Zayed Univ AI, U Arab Emirates; Aalto Univ, Finland
Laaksonen, Jorma (author)
Aalto Univ, Finland
Shah, Mubarak (author)
Univ Cent Florida, FL USA
Khan, Fahad (author)
Linköping University, Computer Vision, Faculty of Science and Engineering; Mohamed bin Zayed Univ AI, U Arab Emirates
IEEE COMPUTER SOC, 2023
English.
In: 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR. IEEE COMPUTER SOC. ISBN 9798350301298, 9798350301304; pp. 5968-5976
  • Conference paper (peer-reviewed)
Abstract
  • The pose-guided person image generation task requires synthesizing photorealistic images of humans in arbitrary poses. The existing approaches use generative adversarial networks that do not necessarily maintain realistic textures or need dense correspondences that struggle to handle complex deformations and severe occlusions. In this work, we show how denoising diffusion models can be applied for high-fidelity person image synthesis with strong sample diversity and enhanced mode coverage of the learnt data distribution. Our proposed Person Image Diffusion Model (PIDM) disintegrates the complex transfer problem into a series of simpler forward-backward denoising steps. This helps in learning plausible source-to-target transformation trajectories that result in faithful textures and undistorted appearance details. We introduce a texture diffusion module based on cross-attention to accurately model the correspondences between appearance and pose information available in source and target images. Further, we propose disentangled classifier-free guidance to ensure close resemblance between the conditional inputs and the synthesized output in terms of both pose and appearance information. Our extensive results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios. We also show how our generated images can help in downstream tasks. Code is available at https://github.com/ankanbhunia/PIDM.
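
The abstract mentions disentangled classifier-free guidance but gives no formula. The following is a minimal sketch of the general idea, assuming the usual classifier-free guidance form extended with one weight per condition. The function name, argument names, and weights (`w_pose`, `w_style`) are illustrative assumptions, not taken from this record or from the authors' code.

```python
from typing import List, Sequence


def disentangled_guidance(
    eps_uncond: Sequence[float],
    eps_pose: Sequence[float],
    eps_style: Sequence[float],
    w_pose: float,
    w_style: float,
) -> List[float]:
    """Combine three noise predictions using a separate weight per condition.

    eps_uncond: prediction with both conditions dropped
    eps_pose:   prediction conditioned on the target pose only (assumed)
    eps_style:  prediction conditioned on the source appearance only (assumed)
    """
    if not (len(eps_uncond) == len(eps_pose) == len(eps_style)):
        raise ValueError("all predictions must have the same length")
    # Start from the unconditional prediction and move along each
    # condition's direction by its own guidance weight.
    return [
        u + w_pose * (p - u) + w_style * (s - u)
        for u, p, s in zip(eps_uncond, eps_pose, eps_style)
    ]


# Toy values only; a real model would produce image-sized tensors.
guided = disentangled_guidance(
    [0.0, 1.0], [1.0, 1.0], [0.0, 3.0], w_pose=2.0, w_style=0.5
)
```

Keeping the two weights separate lets pose fidelity and appearance fidelity be tuned independently; setting both to zero recovers the unconditional prediction.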

Subject headings

NATURAL SCIENCES -- Computer and Information Sciences -- Computer Vision and Robotics (hsv//eng)

Publication and content type

ref (subject category)
kon (subject category)
