Abstract
We address the problem of recovering the 3D geometry of a human face from a set of facial images captured in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. The single-view setting has an inherent drawback: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. We propose a novel approach that regresses 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multi-view geometric constraints are incorporated into the network by establishing dense correspondences between different views through a novel self-supervised view alignment loss. The main ingredient of this loss is a differentiable dense optical flow estimator that backpropagates the alignment error between an input view and a synthetic rendering of another input view, projected into the target view through the 3D shape being inferred. By minimizing the view alignment loss, better 3D shapes are recovered, such that synthetic projections from one view to another align better with the observed images. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM methods.
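The core idea behind the view alignment loss can be illustrated with a minimal toy sketch. This is a heavy simplification under stated assumptions: the paper uses a learned differentiable dense optical flow estimator on rendered images, whereas the sketch below uses known pinhole cameras, nearest-neighbor sampling, and a direct photometric L1 comparison; all function names (`project`, `sample`, `view_alignment_loss`) and the camera parameters are hypothetical, not from the paper.

```python
import numpy as np

def project(points, K, R, t):
    """Pinhole projection of an (N, 3) point array to (N, 2) pixel coords."""
    cam = points @ R.T + t           # world -> camera coordinates
    uv = cam[:, :2] / cam[:, 2:3]    # perspective divide
    return uv @ K[:2, :2].T + K[:2, 2]

def sample(image, uv):
    """Nearest-neighbor lookup of image values at (N, 2) pixel coords."""
    h, w = image.shape[:2]
    x = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    y = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image[y, x]

def view_alignment_loss(shape, img_a, img_b, cam_a, cam_b):
    """Project a hypothesized 3D shape into both views and penalize the
    L1 difference between the intensities sampled at corresponding pixels."""
    samples_a = sample(img_a, project(shape, *cam_a))
    samples_b = sample(img_b, project(shape, *cam_b))
    return np.abs(samples_a - samples_b).mean()

# --- Toy scene: a planar grid of 3D points seen by two cameras ----------
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
cam_a = (K, R, np.zeros(3))
cam_b = (K, R, np.array([-0.5, 0.0, 0.0]))  # small horizontal baseline

xs = np.arange(-1.0, 1.01, 0.2)
pts = np.array([[x, y, 5.0] for y in xs for x in xs])
colors = np.linspace(1.0, 2.0, len(pts))

# Render each view: a mild gradient background plus the painted points.
base = 0.001 * (np.arange(64)[None, :] + 10.0 * np.arange(64)[:, None])
img_a, img_b = base.copy(), base.copy()
for img, cam in ((img_a, cam_a), (img_b, cam_b)):
    uv = np.round(project(pts, *cam)).astype(int)
    img[uv[:, 1], uv[:, 0]] = colors

loss_correct = view_alignment_loss(pts, img_a, img_b, cam_a, cam_b)
loss_wrong = view_alignment_loss(pts + [0.1, 0.0, 0.0], img_a, img_b, cam_a, cam_b)
```

With the true shape, the two views sample the same surface colors and the loss is (near) zero; a perturbed shape projects onto misaligned pixels and the loss grows, which is the signal that drives the 3DMM parameters during training.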
URL
https://arxiv.org/abs/1904.04473