Abstract
Unveiling the real appearance of retouched faces to prevent malicious users from deceptive advertising and economic fraud has been an increasing concern in the era of digital economics. This article makes the first attempt to investigate the face retouching reversal (FRR) problem. We first collect an FRR dataset, named deepFRR, which contains 50,000 StyleGAN-generated high-resolution (1024*1024) facial images and their corresponding retouched ones by a commercial online API. To our best knowledge, deepFRR is the first FRR dataset tailored for training the deep FRR models. Then, we propose a novel diffusion-based FRR approach (FRRffusion) for the FRR task. Our FRRffusion consists of a coarse-to-fine two-stage network: A diffusion-based Facial Morpho-Architectonic Restorer (FMAR) is constructed to generate the basic contours of low-resolution faces in the first stage, while a Transformer-based Hyperrealistic Facial Detail Generator (HFDG) is designed to create high-resolution facial details in the second stage. Tested on deepFRR, our FRRffusion surpasses the GP-UNIT and Stable Diffusion methods by a large margin in four widespread quantitative metrics. Especially, the de-retouched images by our FRRffusion are visually much closer to the raw face images than both the retouched face images and those restored by the GP-UNIT and Stable Diffusion methods in terms of qualitative evaluation with 85 subjects. These results sufficiently validate the efficacy of our work, bridging the recently-standing gap between the FRR and generic image restoration tasks. The dataset and code are available at this https URL.
Abstract (translated)
在数字经济的时期,揭开修饰前后的脸的真实外观以防止恶意用户进行欺骗性广告和经济欺诈是一个越来越关注的问题。本文是首次调查了脸部修饰反向(FRR)问题。我们首先收集了一个名为deepFRR的数据集,其中包含50,000个由StyleGAN生成的具有1024*1024分辨率的高清(1024*1024)面部图像以及它们由商业在线API修整过的相应图像。据我们所知,deepFRR是第一个针对训练深度FRR模型的FRR数据集。然后,我们提出了一个新颖的扩散为基础的FRR方法(FRRffusion)用于FRR任务。我们的FRRffusion包括一个粗到细的两级网络:首先,通过扩散构建面部形态还原器(FMAR),以生成低分辨率面部的基本轮廓;其次,设计了一个Transformer-based超现实面部细节生成器(HFDG),用于在第二个阶段创建高分辨率面部细节。在deepFRR上测试我们的FRRffusion,我们的FRRffusion在四个广泛的定量指标上超过了GP-UNIT和Stable Diffusion方法。特别是,我们通过FRRffusion生成的去修补过的图像在视觉上与原始面部图像非常接近,而在定量评估中,与GP-UNIT和Stable Diffusion方法相比,修复后的图像在85个受试者中的质量评估结果也相差无几。这些结果充分验证了我们的工作,缩小了FRR和通用图像修复任务之间的最近空白。数据集和代码可在此https URL找到。
URL
https://arxiv.org/abs/2405.07582