Abstract
Automatic 3D facial texture generation has gained significant interest recently. Existing approaches either do not support the traditional physically based rendering (PBR) pipeline or rely on 3D data captured by a Light Stage. Our key contribution is a progressive latent-space refinement approach that bootstraps from 3D Morphable Model (3DMM)-based texture maps, generated from facial images, to produce high-quality and diverse PBR textures, including albedo, normal, and roughness maps. It starts by enhancing Generative Adversarial Networks (GANs) for text-guided, diverse texture generation. To this end, we design a self-supervised paradigm that removes the reliance on ground-truth 3D textures and trains the generative model with only entangled texture maps. In addition, we foster mutual enhancement between GANs and Score Distillation Sampling (SDS): SDS expands the generative modes of the GAN, while the GAN makes SDS optimization more efficient. Furthermore, we introduce an edge-aware SDS for multi-view-consistent facial structure. Experiments demonstrate that our method outperforms existing 3D texture generation methods in terms of photorealistic quality, diversity, and efficiency.
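For context on the GAN–SDS interplay described above: the standard SDS gradient (as introduced in DreamFusion) distills a pretrained 2D text-conditioned diffusion prior into the parameters of a 3D representation. This is the common formulation, not necessarily the paper's exact edge-aware variant:

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

Here $x = g(\theta)$ is the image rendered from parameters $\theta$ (in this work, the texture maps), $x_t$ is its noised version at diffusion timestep $t$, $\epsilon \sim \mathcal{N}(0, I)$ is the injected noise, $y$ is the text prompt, $\hat{\epsilon}_\phi$ is the frozen diffusion model's noise prediction, and $w(t)$ is a timestep-dependent weighting.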
URL
https://arxiv.org/abs/2404.09540