Simultaneous regression and feature learning for facial landmarking

2019-04-24 13:15:30
Janez Križaj, Peter Peer, Vitomir Štruc, Simon Dobrišek

Abstract

Face alignment (or facial landmarking) is an important task in many face-related applications, ranging from registration, tracking and animation to higher-level classification problems such as face, expression or attribute recognition. While several solutions to this task have been presented in the literature, reliably locating salient facial features across a wide range of poses remains challenging. To address this issue, we propose a novel method for automatic facial landmark localization in 3D face data, designed specifically to handle the appearance variability caused by significant pose variations. Our method builds on recent cascaded-regression-based approaches to facial landmarking and uses a gating mechanism to combine multiple linear cascaded regression models, each trained for a limited range of poses, into a single powerful landmarking model capable of processing arbitrarily posed input data. We develop two distinct approaches around the proposed gating mechanism: i) the first uses a gated multiple ridge descent (GRID) mechanism in conjunction with established (hand-crafted) HOG features for face alignment and achieves state-of-the-art landmarking performance across a wide range of facial poses; ii) the second simultaneously learns multiple descent directions as well as binary features (SMUF) that are optimal for the alignment task and, in addition to delivering competitive landmarking results, also ensures extremely rapid processing. We evaluate both approaches in rigorous experiments on several popular datasets of 3D face images, i.e., the FRGCv2 and Bosphorus 3D Face datasets and image collections F and G from the University of Notre Dame. The results of our evaluation show that both approaches are competitive with the state of the art while exhibiting considerable robustness to pose variations.
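
The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' GRID or SMUF implementation: the hard nearest-center gate, the `extract` feature function, and all names and numbers below are assumptions chosen for illustration. At each cascade stage, features are computed at the current landmark estimate, a gate selects one of several linear regressors (each standing in for a model trained on a limited pose range), and the selected regressor predicts an update to the landmarks.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
# One stage = (gate centers, one (R, b) linear regressor per center).
Stage = Tuple[List[Vector], List[Tuple[Matrix, Vector]]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def nearest_gate(features: Sequence[float], centers: List[Vector]) -> int:
    """Hard gate (an assumption): index of the center closest to the features."""
    dists = [sum((f - c) ** 2 for f, c in zip(features, center))
             for center in centers]
    return dists.index(min(dists))


def gated_cascade(shape: Vector,
                  extract: Callable[[Vector], Vector],
                  stages: List[Stage]) -> Vector:
    """Refine a flattened landmark vector through gated linear stages."""
    for centers, regressors in stages:
        phi = extract(shape)              # features at the current estimate
        k = nearest_gate(phi, centers)    # choose a pose-specific regressor
        R, b = regressors[k]
        delta = [u + bi for u, bi in zip(matvec(R, phi), b)]
        shape = [s + d for s, d in zip(shape, delta)]  # additive update
    return shape


# Toy usage: the "features" are simply the current shape (hypothetical).
zero = [[0.0, 0.0], [0.0, 0.0]]
stage: Stage = ([[0.0, 0.0], [10.0, 10.0]],
                [(zero, [1.0, 1.0]), (zero, [-1.0, -1.0])])
refined = gated_cascade([0.0, 0.0], lambda s: list(s), [stage, stage])
```

In a real system the feature function would be something like HOG descriptors sampled around each landmark, and the regressors would be learned (for example by ridge regression) from training data; the toy values here only exercise the control flow.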

URL

https://arxiv.org/abs/1904.10787

PDF

https://arxiv.org/pdf/1904.10787.pdf

