Paper Reading AI Learner

DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images

2024-04-24 07:25:36
Orazio Pontorno (1), Luca Guarnera (1), Sebastiano Battiato (1) ((1) University of Catania)

Abstract

Deepfakes, synthetic images generated by deep learning algorithms, represent one of the biggest challenges in the field of Digital Forensics. The scientific community is working to develop approaches that can discriminate the origin of digital images (real or AI-generated). However, these methodologies face the challenge of generalization, that is, the ability to discern the nature of an image even if it is generated by an architecture not seen during training. This usually leads to a drop in performance. In this context, we propose a novel approach based on three blocks called Base Models, each of which is responsible for extracting the discriminative features of a specific image class (Diffusion Model-generated, GAN-generated, or real) as it is trained by exploiting deliberately unbalanced datasets. The features extracted from each block are then concatenated and processed to discriminate the origin of the input image. Experimental results showed that this approach not only demonstrates good robust capabilities to JPEG compression but also outperforms state-of-the-art methods in several generalization tests. Code, models and dataset are available at this https URL.

Abstract (translated)

Deepfakes,由深度学习算法生成的合成图像,是数字取证领域的一个巨大挑战。科学界正在努力开发方法来区分数字图像(真实或由AI生成)的来源。然而,这些方法面临着泛化能力的挑战,即,即使是在训练过程中没有见过的架构中生成图像,也能辨别出图像的性质。通常会导致性能下降。在这种情况下,我们提出了一个基于三个模块的新颖方法,称为基本模型,每个模块负责从训练过程中提取特定图像类别的区分性特征(扩散模型生成,GAN生成或真实)并利用故意不平衡的数据集。从每个模块提取的特征然后进行连接和处理,以区分输入图像的来源。实验结果表明,这种方法不仅表现出对JPEG压缩的良好容错性,而且在几个通用测试中超过了最先进的方法。代码,模型和数据集都可以在上述链接中找到。

URL

https://arxiv.org/abs/2404.15697

PDF

https://arxiv.org/pdf/2404.15697.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot