Abstract
Recent advances in image manipulation have made it possible to generate photorealistic content of unprecedented quality, but they have also removed the barriers to arbitrary manipulation and editing, raising concerns about multimedia authenticity and cybersecurity. However, existing Image Manipulation Detection and Localization (IMDL) methodologies predominantly focus on splicing or copy-move forgeries and lack dedicated benchmarks for inpainting-based manipulations. To bridge this gap, we present COCOInpaint, a comprehensive benchmark specifically designed for inpainting detection, with three key contributions: 1) high-quality inpainting samples generated by six state-of-the-art inpainting models; 2) diverse generation scenarios enabled by four mask generation strategies with optional text guidance; and 3) large-scale coverage, comprising 258,266 inpainted images with rich semantic diversity. Our benchmark is constructed to emphasize intrinsic inconsistencies between inpainted and authentic regions rather than superficial semantic artifacts such as object shapes. We establish a rigorous evaluation protocol using three standard metrics to assess existing IMDL approaches. The dataset will be made publicly available to facilitate future research in this area.
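The abstract describes a sample-generation pipeline (an inpainting model, a mask strategy, and optional text guidance) whose mask doubles as the localization ground truth. The sketch below shows what producing one such sample could look like; since the abstract does not name the six inpainting models, the four mask strategies, or the prompt sources, the Stable Diffusion inpainting checkpoint, the random rectangular mask, the prompt, and the file paths used here are purely illustrative assumptions.

```python
# Minimal sketch of producing one inpainted benchmark sample plus its pixel-level
# ground-truth mask. The specific models, mask strategies, and prompts used by
# COCOInpaint are not specified in the abstract; everything below (checkpoint,
# random rectangular mask, prompt text, file paths) is an assumption.
import numpy as np
from PIL import Image, ImageDraw
from diffusers import StableDiffusionInpaintPipeline


def random_box_mask(size, rng):
    """One hypothetical mask strategy: a single random rectangular region."""
    w, h = size
    mask = Image.new("L", size, 0)
    x0 = int(rng.integers(0, w // 2))
    y0 = int(rng.integers(0, h // 2))
    x1 = min(x0 + int(rng.integers(w // 8, w // 2)), w - 1)
    y1 = min(y0 + int(rng.integers(h // 8, h // 2)), h - 1)
    ImageDraw.Draw(mask).rectangle([x0, y0, x1, y1], fill=255)
    return mask


# Illustrative stand-in for the six state-of-the-art inpainting models.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
)

rng = np.random.default_rng(0)
# Placeholder path to an authentic source image.
image = Image.open("coco_example.jpg").convert("RGB").resize((512, 512))
mask = random_box_mask(image.size, rng)

# Optional text guidance; an empty prompt corresponds to unguided inpainting.
prompt = "a photorealistic completion of the masked region"
inpainted = pipe(prompt=prompt, image=image, mask_image=mask).images[0]

inpainted.save("inpainted.png")  # manipulated image fed to IMDL detectors
mask.save("gt_mask.png")         # localization ground truth (inpainted region)
```

Under the evaluation protocol mentioned above, IMDL methods would then be scored on how closely their predicted manipulation maps for inpainted.png match gt_mask.png.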
URL
https://arxiv.org/abs/2504.18361