X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention


Abstract

We propose X-Portrait, an innovative conditional diffusion model tailored for generating expressive and temporally coherent portrait animation. Specifically, given a single portrait as the appearance reference, we aim to animate it with motion derived from a driving video, capturing both highly dynamic and subtle facial expressions along with wide-range head movements. At its core, we leverage the generative prior of a pre-trained diffusion model as the rendering backbone, while achieving fine-grained head pose and expression control with novel controlling signals within the framework of ControlNet. In contrast to conventional coarse explicit controls such as facial landmarks, our motion control module learns to interpret the dynamics directly from the original driving RGB inputs. Motion accuracy is further improved by a patch-based local control module that strengthens the motion attention to small-scale nuances such as eyeball positions. Notably, to mitigate identity leakage from the driving signals, we train our motion control modules with scaling-augmented cross-identity images, ensuring maximal disentanglement from the appearance reference modules. Experimental results demonstrate the effectiveness of X-Portrait across a diverse range of facial portraits and expressive driving sequences, and showcase its proficiency in generating captivating portrait animations with consistently maintained identity characteristics.
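The abstract mentions training the motion control modules on scaling-augmented images so that the driving frame's facial proportions are less likely to leak into the output. The following is only a minimal sketch of what such a scaling augmentation could look like, not the authors' implementation: the function names, the 0.8 to 1.2 scale range, the nearest-neighbour sampling, and the zero fill are all assumptions made for illustration.

```python
import random


def scale_about_center(image, scale, fill=0):
    """Rescale a 2-D grid (list of rows) about its center, keeping its size.

    Uses nearest-neighbour inverse mapping: each output pixel looks up the
    source pixel it came from. Pixels that map outside the source get `fill`.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    out = []
    for y in range(h):
        sy = round((y - cy) / scale + cy)  # source row for this output row
        row = []
        for x in range(w):
            sx = round((x - cx) / scale + cx)  # source column
            inside = 0 <= sy < h and 0 <= sx < w
            row.append(image[sy][sx] if inside else fill)
        out.append(row)
    return out


def random_scale_augment(driving_frame, scale_range=(0.8, 1.2), rng=random):
    """Apply a random zoom to a driving frame (hypothetical helper).

    The scale range is an assumed value, not one taken from the paper.
    Returns the augmented frame and the scale that was drawn.
    """
    scale = rng.uniform(*scale_range)
    return scale_about_center(driving_frame, scale), scale
```

In this sketch a scale below 1 shrinks the face and pads the border, while a scale above 1 zooms in and crops. Either way the head pose and expression in the driving frame are preserved while its absolute proportions change.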

Abstract (translated)

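We propose X-Portrait, an innovative conditional diffusion model for generating expressive, temporally consistent portrait animation. Specifically, given a single portrait as the appearance reference, we aim to animate it with motion from a driving video, capturing both highly dynamic and subtle facial expressions as well as wide-range head movement. At its core, we use the generative prior of a pre-trained diffusion model as the rendering backbone, while achieving fine-grained head pose and expression control through new control signals within ControlNet. Compared with conventional coarse explicit controls (such as facial landmarks), our motion control module learns to interpret the dynamics directly from the original driving RGB inputs. A patch-based local control module further strengthens attention to small-scale, subtle motion such as eyeball position. Notably, to mitigate identity leakage from the driving signals, we train our motion control modules with scaling-augmented cross-identity images, ensuring maximal disentanglement from the appearance reference modules. Experimental results show that X-Portrait is broadly effective across a variety of facial portraits and expressive driving sequences, and demonstrate its ability to generate captivating portrait animations with consistently preserved identity characteristics.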

URL

https://arxiv.org/abs/2403.15931

PDF

https://arxiv.org/pdf/2403.15931.pdf
