FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge

2024-04-22 04:41:42
Hanzhe Li, Jiaran Zhou, Bin Li, Junyu Dong, Yuezun Li

Abstract

Generating synthetic fake faces, known as pseudo-fake faces, is an effective way to improve the generalization of DeepFake detection. Existing methods typically generate these faces by blending real or fake faces in color space. While these methods have shown promise, they overlook the simulation of the frequency distribution of pseudo-fake faces, limiting the in-depth learning of generic forgery traces. To address this, this paper introduces FreqBlender, a new method that generates pseudo-fake faces by blending frequency knowledge. Specifically, we investigate the major frequency components and propose a Frequency Parsing Network to adaptively partition the frequency components related to forgery traces. We then blend this frequency knowledge from fake faces into real faces to generate pseudo-fake faces. Since there is no ground truth for frequency components, we describe a dedicated training strategy that leverages the inner correlations among different frequency knowledge to guide the learning process. Experimental results demonstrate the effectiveness of our method in enhancing DeepFake detection, making it a potential plug-and-play strategy for other methods.
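The core blending step, taking forgery-related frequency components from a fake face and the remaining components from a real face, can be illustrated with a short sketch. This is not the authors' implementation: it assumes a 2D DCT as the frequency transform and uses a hand-crafted soft mask as a stand-in for the partition that the paper's Frequency Parsing Network learns; the function name blend_frequency and the toy mask are hypothetical.

```python
# Minimal sketch of frequency-domain blending for pseudo-fake face generation.
# NOTE: assumptions, not the paper's method: 2D DCT as the transform, and a
# hand-crafted soft mask standing in for the learned Frequency Parsing Network output.
import numpy as np
from scipy.fft import dctn, idctn


def blend_frequency(real_face: np.ndarray,
                    fake_face: np.ndarray,
                    forgery_mask: np.ndarray) -> np.ndarray:
    """Blend forgery-related frequency components of a fake face into a real face.

    real_face, fake_face: grayscale images of the same shape, floats in [0, 1].
    forgery_mask: same shape, values in [0, 1]; values near 1 mark frequency bins
                  treated as carrying forgery traces (hypothetical stand-in for
                  the learned partition).
    """
    real_freq = dctn(real_face, norm="ortho")
    fake_freq = dctn(fake_face, norm="ortho")
    # Take forgery-related bins from the fake face, the rest from the real face.
    blended_freq = forgery_mask * fake_freq + (1.0 - forgery_mask) * real_freq
    pseudo_fake = idctn(blended_freq, norm="ortho")
    return np.clip(pseudo_fake, 0.0, 1.0)


if __name__ == "__main__":
    h, w = 256, 256
    real = np.random.rand(h, w)   # placeholder for an aligned real face crop
    fake = np.random.rand(h, w)   # placeholder for an aligned fake face crop
    # Toy mask: a soft ramp emphasizing higher-frequency DCT bins, where blending
    # artifacts often concentrate.
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    mask = np.clip((yy + xx) / (h + w), 0.0, 1.0)
    pseudo = blend_frequency(real, fake, mask)
    print(pseudo.shape, pseudo.min(), pseudo.max())
```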

URL

https://arxiv.org/abs/2404.13872

PDF

https://arxiv.org/pdf/2404.13872.pdf

