
3D Face Modeling via Weakly-supervised Disentanglement Network joint Identity-consistency Prior

2024-04-25 11:50:47
Guohao Li, Hongyu Yang, Di Huang, Yunhong Wang
       

Abstract

Generative 3D face models with disentangled controlling factors hold great potential for applications in computer vision and computer graphics. However, previous 3D face modeling methods require specific labels to disentangle these factors effectively. This becomes particularly problematic when multiple 3D face datasets are integrated to improve the generalization of the model. To address this issue, this paper introduces a Weakly-Supervised Disentanglement Framework, denoted WSDF, which facilitates the training of controllable 3D face models without an overly stringent labeling requirement. Following the paradigm of Variational Autoencoders (VAEs), the proposed model disentangles the identity and expression controlling factors through a two-branch encoder equipped with a dedicated identity-consistency prior. It then faithfully re-entangles these factors via a tensor-based combination mechanism. Notably, the introduction of the Neutral Bank allows precise acquisition of subject-specific information using only identity labels, thereby averting degeneration due to insufficient supervision. Additionally, the framework incorporates a label-free second-order loss function for the expression factor that regulates the deformation space and eliminates extraneous information, resulting in enhanced disentanglement. Extensive experiments have been conducted to substantiate the superior performance of WSDF. Our code is available at this https URL.
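
The abstract does not spell out the tensor-based combination mechanism, so the following is only an illustrative sketch of one common way to recombine two latent factors: a bilinear product with a core tensor, where each output coordinate mixes every pair of identity and expression components. The names `combine` and `random_core`, the latent sizes, and the randomly initialised core are assumptions made for this example and are not taken from the paper or its code.

```python
import random

def combine(core, z_id, z_exp):
    """Bilinear combination: out[v] = sum over i, j of core[v][i][j] * z_id[i] * z_exp[j]."""
    return [
        sum(core[v][i][j] * z_id[i] * z_exp[j]
            for i in range(len(z_id))
            for j in range(len(z_exp)))
        for v in range(len(core))
    ]

def random_core(n_out, d_id, d_exp, seed=0):
    """Hypothetical stand-in for a learned core tensor of shape (n_out, d_id, d_exp)."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 0.1) for _ in range(d_exp)]
             for _ in range(d_id)]
            for _ in range(n_out)]

# Toy sizes (assumed): 6 output coordinates, 3-D identity code, 2-D expression code.
core = random_core(n_out=6, d_id=3, d_exp=2)
z_id = [0.5, -1.0, 0.2]        # one subject's identity code
z_neutral = [1.0, 0.0]         # expression code for a neutral face
z_smile = [1.0, 0.8]           # a different expression, same subject

neutral_face = combine(core, z_id, z_neutral)
smiling_face = combine(core, z_id, z_smile)
```

In this sketch the identity code is held fixed while only the expression code changes, which is the kind of independent control a disentangled model is meant to offer. A real model would learn the core (or an equivalent decoder) from data instead of sampling it.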

URL

https://arxiv.org/abs/2404.16536

PDF

https://arxiv.org/pdf/2404.16536.pdf

