FRRffusion: Unveiling Authenticity with Diffusion-Based Face Retouching Reversal

2024-05-13 09:38:49
Fengchuang Xing, Xiaowen Shi, Yuan-Gen Wang, Chunsheng Yang

Abstract

Unveiling the real appearance of retouched faces to prevent malicious users from deceptive advertising and economic fraud has been an increasing concern in the era of the digital economy. This article makes the first attempt to investigate the face retouching reversal (FRR) problem. We first collect an FRR dataset, named deepFRR, which contains 50,000 StyleGAN-generated high-resolution (1024×1024) facial images and their retouched counterparts produced by a commercial online API. To the best of our knowledge, deepFRR is the first FRR dataset tailored for training deep FRR models. Then, we propose a novel diffusion-based approach (FRRffusion) for the FRR task. FRRffusion consists of a coarse-to-fine two-stage network: a diffusion-based Facial Morpho-Architectonic Restorer (FMAR) generates the basic contours of low-resolution faces in the first stage, while a Transformer-based Hyperrealistic Facial Detail Generator (HFDG) creates high-resolution facial details in the second stage. Tested on deepFRR, FRRffusion surpasses the GP-UNIT and Stable Diffusion methods by a large margin on four widely used quantitative metrics. In particular, in a qualitative evaluation with 85 subjects, the images de-retouched by FRRffusion are visually much closer to the raw face images than both the retouched face images and those restored by the GP-UNIT and Stable Diffusion methods. These results validate the efficacy of our work, bridging the gap between the FRR and generic image restoration tasks. The dataset and code are publicly available.
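
The coarse-to-fine data flow described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not the authors' implementation: the function names (`downsample`, `upsample`, `coarse_to_fine`), the idea that stage one receives a downsampled copy of the retouched image, the scale factor, and the list-of-lists image representation are all assumptions made here. The two stages are passed in as plain callables that stand in for the roles of FMAR and HFDG.

```python
from typing import Callable, List

# Hypothetical image type: a grayscale image as rows of pixel intensities.
Image = List[List[float]]


def downsample(img: Image, factor: int) -> Image:
    """Average non-overlapping factor x factor blocks (assumed preprocessing)."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / factor ** 2
            for x in range(0, w - w % factor, factor)
        ]
        for y in range(0, h - h % factor, factor)
    ]


def upsample(img: Image, factor: int) -> Image:
    """Nearest-neighbour upsampling, used below as a stand-in for stage two."""
    return [
        [px for px in row for _ in range(factor)]
        for row in img
        for _ in range(factor)
    ]


def coarse_to_fine(
    retouched: Image,
    coarse_stage: Callable[[Image], Image],
    fine_stage: Callable[[Image], Image],
    factor: int = 4,
) -> Image:
    """Run a two-stage pipeline: low-resolution contours first, details second."""
    low_res = downsample(retouched, factor)  # assumption: stage one works at low resolution
    coarse = coarse_stage(low_res)           # plays the role of FMAR (placeholder)
    return fine_stage(coarse)                # plays the role of HFDG (placeholder)
```

With an identity function as the first stage and `lambda im: upsample(im, 4)` as the second, an 8×8 input comes back as an 8×8 output, which is enough to check the shapes flowing through the pipeline. A real system would replace both placeholders with trained models.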

URL

https://arxiv.org/abs/2405.07582

PDF

https://arxiv.org/pdf/2405.07582.pdf

