SeFFeC: Semantic Facial Feature Control for Fine-grained Face Editing

2024-03-20 20:47:53
Florian Strohm, Mihai Bâce, Markus Kaltenecker, Andreas Bulling

Abstract

We propose Semantic Facial Feature Control (SeFFeC) - a novel method for fine-grained face shape editing. Our method enables the manipulation of human-understandable, semantic face features, such as nose length or mouth width, which are defined by different groups of facial landmarks. In contrast to existing methods, the use of facial landmarks enables precise measurement of the facial features, which in turn allows SeFFeC to be trained without any manually annotated labels. SeFFeC consists of a transformer-based encoder network that takes a latent vector of a pre-trained generative model and a facial feature embedding as input, and learns to modify the latent vector to perform the desired face edit operation. To ensure that the desired feature measurement is changed towards the target value without altering uncorrelated features, we introduce a novel semantic face feature loss. Qualitative and quantitative results show that SeFFeC enables precise and fine-grained control of 23 facial features, some of which could not previously be controlled by other methods, without requiring manual annotations. Unlike existing methods, SeFFeC also provides deterministic control over the exact values of the facial features and more localised and disentangled face edits.
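
To make the idea of landmark-defined features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 68-point landmark layout, two hypothetical feature definitions (`mouth_width`, `nose_length`), normalisation by the distance between the outer eye corners, and a simplified squared-error loss that pushes the edited feature towards its target while penalising change in every other feature. The abstract only says that uncorrelated features should stay unaltered, so the names, the landmark indices, and the exact form of the loss are all assumptions.

```python
import math

# Hypothetical feature definitions: each feature is the distance between two
# landmarks. Indices assume a 0-based 68-point layout; the landmark groups
# actually used by SeFFeC are not given in the abstract.
FEATURES = {
    "mouth_width": (48, 54),  # left and right mouth corners
    "nose_length": (27, 33),  # top of the nose bridge and base of the nose
}
NORMALISER = (36, 45)  # outer eye corners, used here as a scale reference


def distance(landmarks, i, j):
    """Euclidean distance between landmarks i and j, given as (x, y) pairs."""
    (x1, y1), (x2, y2) = landmarks[i], landmarks[j]
    return math.hypot(x2 - x1, y2 - y1)


def measure(landmarks):
    """Return scale-normalised measurements for every defined feature."""
    scale = distance(landmarks, *NORMALISER)
    return {name: distance(landmarks, i, j) / scale
            for name, (i, j) in FEATURES.items()}


def feature_loss(before, after, edited, target):
    """Simplified stand-in for a semantic feature loss (an assumption).

    The first term pulls the edited feature towards its target value; the
    second penalises any change in the remaining features.
    """
    target_term = (after[edited] - target) ** 2
    preserve_term = sum((after[name] - before[name]) ** 2
                        for name in before if name != edited)
    return target_term + preserve_term
```

Because the measurements come directly from landmarks, a target such as "mouth width of 0.7 eye-distances" can be checked on the edited image without any human labels, which is the property the abstract relies on for annotation-free training.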

URL

https://arxiv.org/abs/2403.13972

PDF

https://arxiv.org/pdf/2403.13972.pdf

