My research interests include Computer Vision, Generative Models, and Embodied Agents. In particular, I am interested in using AI-generated content for Embodied Planning and Content Creation.
I am actively seeking full-time opportunities starting in 2026.
Please feel free to reach out if you think my background could be a good fit.
We introduce ENVISION, a novel framework that generates physically plausible planning videos with precise instruction following, enabling direct execution on robotic systems.
We introduce Diffportrait360, a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses.
We present DiffPortrait3D, a conditional diffusion model
that is capable of synthesizing 3D-consistent photo-realistic
novel views from as few as a single in-the-wild portrait.
We present a deep learning-based framework for portrait
reenactment from a single picture of a target (one-shot) and a video of a
driving subject.
Protecting World Leaders Against Deep Fakes
Shruti Agarwal, Hany Farid, Yuming Gu, Mingming He, Koki Nagano, Hao Li
Computer Vision and Pattern Recognition (CVPR workshops), 2019 [PDF]
Area: Image synthesis, Media Forensics
We describe a forensic technique that models facial expressions and movements that typify an individual's speaking pattern.