My research interests include Computer Vision, Generative Models, and Embodied Agents. In particular, I am interested in using AI-generated content for Embodied Planning and Content Creation.
I am actively seeking full-time opportunities starting in 2026.
Please feel free to reach out if you think my background could be a good fit.
We introduce ENVISION, a novel framework that generates physically plausible planning videos with precise instruction following, enabling direct execution on robotic systems.
CONDITION MATTERS IN FULL-HEAD 3D GANS
Heyuan Li, Huimin Zhang, Yuda Qiu, Zhengwentai Sun, Keru Zheng, Lingteng Qiu, Peihao Li, Qi Zuo, Ce Chen, Yujian Zheng, Yuming Gu, Zilong Dong, Xiaoguang Han
ICLR, 2026 [PDF] [Page] Area: Novel View Synthesis, 3D GAN, Head
We propose a novel view-invariant semantic feature as the conditioning input, thereby decoupling the generative capability of 3D heads from the viewing direction.
We introduce Diffportrait360, a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses.
We present DiffPortrait3D, a conditional diffusion model
that is capable of synthesizing 3D-consistent photo-realistic
novel views from as few as a single in-the-wild portrait.
We present a deep learning-based framework for portrait
reenactment from a single picture of a target (one-shot) and a video of a
driving subject.
Protecting World Leaders Against Deep Fakes
Shruti Agarwal, Hany Farid, Yuming Gu, Mingming He, Koki Nagano, Hao Li
Computer Vision and Pattern Recognition (CVPR workshops), 2019 [PDF] Area: Image Synthesis, Media Forensics
We describe a forensic technique that models facial expressions and movements that typify an individual's speaking pattern.