
GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video

2025-01-20 08:58:56
Zhenliang Ni, Qiangyu Yan, Mouxiao Huang, Tianning Yuan, Yehui Tang, Hailin Hu, Xinghao Chen, Yunhe Wang

Abstract

The rapid advancement of video generation models has made it increasingly difficult to distinguish AI-generated videos from real ones. This underscores the urgent need for effective AI-generated video detectors to prevent the spread of false information through such videos. However, the development of high-performance generative video detectors is currently impeded by the lack of large-scale, high-quality datasets designed specifically for generative video detection. To this end, we introduce GenVidBench, a challenging AI-generated video detection dataset with several key advantages: 1) Cross Source and Cross Generator: The cross-source setting mitigates the interference of video content with detection. The cross-generator setting ensures that video attributes differ between the training and test sets, preventing the two sets from being overly similar. 2) State-of-the-Art Video Generators: The dataset includes videos from 8 state-of-the-art AI video generators, so it covers the latest advances in video generation. 3) Rich Semantics: The videos in GenVidBench are analyzed along multiple dimensions and classified into semantic categories based on their content. This classification ensures that the dataset is not only large but also diverse, aiding the development of more generalized and effective detection models. We conduct a comprehensive evaluation of different advanced video generators and present a challenging setting. Additionally, we present rich experimental results, including advanced video classification models as baselines. With GenVidBench, researchers can efficiently develop and evaluate AI-generated video detection models. Datasets and code are available at this https URL.
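The cross-generator idea in point 1 can be illustrated with a small sketch. This is only a minimal illustration under stated assumptions, not the split used by GenVidBench: the `Clip` record, the `cross_generator_split` helper, and the generator names `gen_a`, `gen_b`, `gen_c` are all hypothetical, and real videos are not handled here. The point it shows is that every generator appears in either the training set or the test set, never both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """Hypothetical record for one generated video (not the dataset's schema)."""
    path: str
    generator: str  # name of the model that produced the clip (assumed field)


def cross_generator_split(clips, held_out_generators):
    """Send clips from held-out generators to test and all others to train."""
    held_out = set(held_out_generators)
    train = [c for c in clips if c.generator not in held_out]
    test = [c for c in clips if c.generator in held_out]
    return train, test


# Toy data with made-up generator names.
clips = [
    Clip("a/0.mp4", "gen_a"),
    Clip("a/1.mp4", "gen_a"),
    Clip("b/0.mp4", "gen_b"),
    Clip("c/0.mp4", "gen_c"),
]

train, test = cross_generator_split(clips, {"gen_c"})
train_generators = {c.generator for c in train}
test_generators = {c.generator for c in test}
```

Because the two generator sets are disjoint, a detector cannot score well on the test set simply by memorizing artifacts of the generators it saw during training.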


URL

https://arxiv.org/abs/2501.11340

PDF

https://arxiv.org/pdf/2501.11340.pdf

