DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis

CVPR 2025
USC¹, MBZUAI², ByteDance Inc.³, CUHK, Shenzhen⁴, Pinscreen Inc.⁵

Given an unposed portrait image, our DiffPortrait360 enables 360-degree view-consistent full-head image synthesis without any fine-tuning, and it is universally effective across a diverse range of facial portraits.

Abstract

We introduce DiffPortrait360, a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses and hats. Our method builds on the DiffPortrait3D framework, incorporating a custom ControlNet for back-of-head detail generation and a dual appearance module to ensure global front-back consistency. By training on continuous view sequences and integrating a back reference image, our approach achieves robust, locally continuous view synthesis. Our model can be used to produce high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering, outperforming state-of-the-art methods in object synthesis and 360-degree head generation for very challenging input portraits.

Architecture


For the task of full-range 360-degree novel view synthesis, DiffPortrait360 employs a frozen pre-trained Latent Diffusion Model (LDM) as a rendering backbone and incorporates three auxiliary trainable modules for disentangled control: a dual appearance module R, a camera control module C, and view-consistency attention modules V in the U-Net. Specifically, R extracts appearance information from the reference image I_ref and the back-view image I_back, while C derives the camera pose from a condition image rendered with an off-the-shelf 3D GAN. During training, we use a continuous sampling strategy that better preserves the continuity of the camera trajectory, and we strengthen cross-frame attention so that appearance information remains stable as the viewing angle changes. At inference, our tailored back-view generation network F synthesizes a back-view image from the input portrait, enabling generation along a full 360-degree camera trajectory from a single portrait image.
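To make the data flow above concrete, the following is a minimal, hypothetical sketch of the inference loop: generate the back reference once with F, then sweep a 360-degree trajectory where each view is conditioned on both references (dual appearance R) and a rendered pose condition (camera control C). All function names here are illustrative stand-ins, not the released API, and the bodies are placeholders rather than the actual networks.

```python
# Hypothetical sketch of the DiffPortrait360 inference flow described above.
# Function names and bodies are illustrative placeholders, not the real models.
import numpy as np


def back_view_generator(front_image: np.ndarray) -> np.ndarray:
    """Stand-in for the back-view generation network F."""
    return np.zeros_like(front_image)  # placeholder output


def render_pose_condition(azimuth_deg: float, image_size: int = 512) -> np.ndarray:
    """Stand-in for the off-the-shelf 3D GAN that renders a camera-pose condition."""
    return np.zeros((image_size, image_size, 3), dtype=np.float32)


def denoise_with_conditions(appearance: tuple, pose_cond: np.ndarray) -> np.ndarray:
    """Stand-in for the frozen LDM backbone plus the trainable R, C, V modules."""
    front, back = appearance
    _ = pose_cond  # the real sampler injects this through the camera-control branch
    return 0.5 * (front + back)  # placeholder blend, not an actual diffusion sample


def synthesize_360(front_image: np.ndarray, num_views: int = 36) -> list[np.ndarray]:
    # 1) Generate the back reference once from the single input portrait (module F).
    back_image = back_view_generator(front_image)
    # 2) Sweep a full 360-degree camera trajectory; each view is conditioned on
    #    both references (dual appearance R) and the rendered pose (camera control C).
    views = []
    for azimuth in np.linspace(0.0, 360.0, num_views, endpoint=False):
        pose_cond = render_pose_condition(azimuth)
        views.append(denoise_with_conditions((front_image, back_image), pose_cond))
    return views


if __name__ == "__main__":
    portrait = np.zeros((512, 512, 3), dtype=np.float32)
    frames = synthesize_360(portrait)
    print(f"Generated {len(frames)} views")
```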

Results

Comparison to Previous Works

Ablation on Dual Appearance Module

Ablation on View Consistency

Acknowledgement

This work is supported by the Metaverse Center Grant from the MBZUAI Research Office. We thank Egor Zakharov, Zhenhui Lin, Maksat Kengeskanov, and Yiming Chen for the early discussions, helpful suggestions, and feedback.

BibTeX


@misc{gu2025diffportrait360consistentportraitdiffusion,
  title={DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis},
  author={Yuming Gu and Phong Tran and Yujian Zheng and Hongyi Xu and Heyuan Li and Adilbek Karmanov and Hao Li},
  year={2025},
  eprint={2503.15667},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2503.15667},
}

This website is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The page template is adapted from open-source project page code; we thank the authors for sharing it.