Paper Reading AI Learner

MVF-Net: Multi-View 3D Face Morphable Model Regression

2019-04-09 05:42:11
Fanzi Wu, Linchao Bao, Yajing Chen, Yonggen Ling, Yibing Song, Songnan Li, King Ngi Ngan, Wei Liu

Abstract

We address the problem of recovering the 3D geometry of a human face from a set of facial images captured from multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, these settings are mostly restricted to a single view. The single-view setting has an inherent drawback: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. We propose a novel approach that regresses 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multi-view geometric constraints are incorporated into the network by establishing dense correspondences between different views via a novel self-supervised view alignment loss. The main ingredient of the view alignment loss is a differentiable dense optical flow estimator that backpropagates the alignment error between an input view and a synthetic rendering from another input view, which is projected into the target view through the 3D shape being inferred. By minimizing the view alignment loss, better 3D shapes can be recovered such that the synthetic projections from one view to another align more closely with the observed images. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM-based methods.

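The core idea of the view alignment loss can be sketched in a few lines: project the inferred 3DMM shape into two views, and penalize disagreement between the shape-induced cross-view displacement and a dense flow estimate. The following is a minimal, hypothetical NumPy illustration under simplifying assumptions (linear 3DMM, weak-perspective cameras, flow given as per-vertex 2D offsets); all function names are assumptions, not the paper's implementation:

```python
import numpy as np

def morphable_shape(mean_shape, basis, params):
    """Linear 3DMM: vertices = mean shape + basis-weighted deformation.
    mean_shape: (N, 3), basis: (K, N, 3), params: (K,)."""
    return mean_shape + np.tensordot(params, basis, axes=1)

def project(vertices, R, t, f):
    """Weak-perspective projection of 3D vertices into one view.
    R: (3, 3) rotation, t: (2,) image-plane translation, f: focal scale."""
    cam = vertices @ R.T
    return f * cam[:, :2] + t

def view_alignment_loss(params, mean_shape, basis, cams, flows_ab):
    """Toy stand-in for the self-supervised view alignment loss:
    project the inferred shape into views a and b, and penalize the gap
    between the shape-induced a->b displacement and a dense flow estimate
    (here simplified to a per-vertex 2D offset per view pair)."""
    verts = morphable_shape(mean_shape, basis, params)
    loss = 0.0
    for (a, b), flow in flows_ab.items():
        pa = project(verts, *cams[a])
        pb = project(verts, *cams[b])
        # the displacement induced by the 3D shape should match the flow
        loss += np.mean(np.sum(((pb - pa) - flow) ** 2, axis=1))
    return loss / len(flows_ab)
```

In the actual method the flow comes from a differentiable optical flow estimator applied to an input view and a synthetic rendering, so the alignment error can be backpropagated through the rendering to the 3DMM parameters; this sketch only shows the geometric consistency term being minimized.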

URL

https://arxiv.org/abs/1904.04473

PDF

https://arxiv.org/pdf/1904.04473.pdf
