HiFECap: Monocular High-Fidelity and Expressive Capture of Human Performances

2022-10-11 17:57:45

Yue Jiang, Marc Habermann, Vladislav Golyanik, Christian Theobalt

arXiv_AI

arXiv_AI Tracking Face Gesture Quantitative Pose 3D

Abstract

Monocular 3D human performance capture is indispensable for many applications in computer graphics and vision that enable immersive experiences. However, detailed capture of humans requires tracking multiple aspects: the skeletal pose, the dynamic surface (including clothing), hand gestures, and facial expressions. No existing monocular method allows joint tracking of all these components. To this end, we propose HiFECap, a new neural human performance capture approach that simultaneously captures human pose, clothing, facial expression, and hands from just a single RGB video. We demonstrate that our proposed network architecture, the carefully designed training strategy, and the tight integration of parametric face and hand models into a template mesh enable the capture of all these individual aspects. Importantly, our method also captures high-frequency details, such as deforming wrinkles on the clothes, better than previous work. Furthermore, we show that HiFECap outperforms state-of-the-art human performance capture approaches qualitatively and quantitatively while, for the first time, capturing all of these aspects of the human.

URL

https://arxiv.org/abs/2210.05665

PDF

https://arxiv.org/pdf/2210.05665.pdf