Paper Reading AI Learner

E2F-Net: Eyes-to-Face Inpainting via StyleGAN Latent Space

2024-03-18 19:11:34
Ahmad Hassanpour, Fatemeh Jamalbafrani, Bian Yang, Kiran Raja, Raymond Veldhuis, Julian Fierrez

Abstract

Face inpainting, the technique of restoring missing or damaged regions in facial images, is pivotal for applications such as face recognition under occlusion and the analysis of poor-quality captures. The process must not only produce realistic visuals but also preserve individual identity characteristics. The aim of this paper is to inpaint a face given the periocular region (eyes-to-face) through a proposed new Generative Adversarial Network (GAN)-based model called Eyes-to-Face Network (E2F-Net). The proposed approach extracts identity and non-identity features from the periocular region using two dedicated encoders. The extracted features are then mapped to the latent space of a pre-trained StyleGAN generator to benefit from its state-of-the-art performance and its rich, diverse, and expressive latent space without any additional training. We further improve the StyleGAN output by finding the optimal code in the latent space using a new optimization technique for GAN inversion. Our E2F-Net requires minimal training, which reduces computational complexity as a secondary benefit. Through extensive experiments, we show that our method successfully reconstructs the whole face with high quality, surpassing current techniques despite significantly less training and supervision effort. We have generated seven eyes-to-face datasets based on well-known public face datasets for training and verifying our proposed method. The code and datasets are publicly available.
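The abstract describes a two-stage pipeline: encoders first map the periocular input to a latent code for a frozen pre-trained generator, and a GAN-inversion step then refines that code by minimizing a reconstruction loss over the observed region. The paper's actual networks are not reproduced here; purely as an illustration of the refinement step, the NumPy sketch below uses a toy linear "generator" and a pixel mask standing in for the periocular region (all names, dimensions, and hyperparameters are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's networks): a frozen linear
# "generator" G mapping a latent code w to a flattened image, and a binary
# mask selecting the observed "periocular" pixels.
LATENT_DIM, IMG_DIM = 8, 32
G = rng.standard_normal((IMG_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)
mask = np.zeros(IMG_DIM)
mask[:8] = 1.0  # pretend the first 8 pixels are the visible eye region

def generate(w):
    """Frozen generator: latent code -> image (StyleGAN stand-in)."""
    return G @ w

# Ground-truth face and its periocular observation.
w_true = rng.standard_normal(LATENT_DIM)
x_true = generate(w_true)

def masked_loss(w):
    """0.5 * ||mask * (G(w) - x_true)||^2 over the observed region only."""
    r = mask * (generate(w) - x_true)
    return 0.5 * float(r @ r)

# Stage 1 (stand-in): the encoders would predict an initial latent code
# from the periocular crop; here we simply start from a random code.
w = rng.standard_normal(LATENT_DIM)
init_loss = masked_loss(w)

# Stage 2: GAN-inversion refinement -- gradient descent on the masked
# reconstruction loss, updating only the latent code (G stays frozen).
lr = 0.1
for _ in range(2000):
    residual = mask * (generate(w) - x_true)
    w -= lr * (G.T @ residual)  # analytic gradient of the masked loss

final_loss = masked_loss(w)
print(f"masked loss: {init_loss:.4f} -> {final_loss:.6f}")
```

The key property illustrated is that only the latent code is updated while the generator stays fixed, which is why this kind of refinement adds no generator training cost.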


URL

https://arxiv.org/abs/2403.12197

PDF

https://arxiv.org/pdf/2403.12197.pdf

