PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

2024-04-29 03:57:43
Jiquan Yuan, Fanyi Yang, Jihe Li, Xinyan Cao, Jinming Che, Jinlong Lin, Xixin Cao
       

Abstract

In recent years, image generation technology has advanced rapidly, producing a vast array of AI-generated images (AIGIs). However, the quality of these AIGIs is highly inconsistent, and low-quality AIGIs severely impair the visual experience of users. Given the widespread application of AIGIs, AI-generated image quality assessment (AIGIQA), which aims to evaluate the quality of AIGIs from the perspective of human perception, has attracted increasing interest among scholars. Nonetheless, current research has not yet fully explored this field. We have observed that existing databases are limited to images generated in a single scenario setting. Databases such as AGIQA-1K, AGIQA-3K, and AIGCIQA2023, for example, include only images generated by text-to-image generative models. This limitation highlights a critical gap in the current research landscape and underscores the need for dedicated databases for image-to-image scenarios, as well as more comprehensive databases that cover a broader range of AI-generated image scenarios. To address these issues, we have established a large-scale perceptual quality assessment database for both text-to-image and image-to-image AIGIs, named PKU-AIGIQA-4K. We then conduct a well-organized subjective experiment to collect quality labels for AIGIs and perform a comprehensive analysis of the PKU-AIGIQA-4K database. According to how image prompts are used during training, we propose three image quality assessment (IQA) methods based on pre-trained models: a no-reference method NR-AIGCIQA, a full-reference method FR-AIGCIQA, and a partial-reference method PR-AIGCIQA. Finally, leveraging the PKU-AIGIQA-4K database, we conduct extensive benchmark experiments and compare the performance of the proposed methods with that of current IQA methods.
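
As a rough illustration of how a benchmark of this kind is commonly scored, the sketch below averages subjective ratings into per-image mean opinion scores and correlates them with a model's predictions. The abstract does not state which label format or metrics the authors use, so the mean opinion score, the Spearman rank correlation (SRCC), and the Pearson linear correlation (PLCC) are assumptions drawn from common IQA practice, and every function name, image id, and number here is hypothetical.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Average each image's subjective ratings into one score (assumed label format)."""
    return {image_id: mean(scores) for image_id, scores in ratings.items()}


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def plcc(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def srcc(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


# Hypothetical ratings from three observers and hypothetical model predictions.
ratings = {"img_a": [4, 5, 4], "img_b": [2, 3, 2], "img_c": [3, 3, 4]}
predicted = {"img_a": 0.9, "img_b": 0.2, "img_c": 0.5}

mos = mean_opinion_scores(ratings)
image_ids = sorted(mos)
labels = [mos[i] for i in image_ids]
scores = [predicted[i] for i in image_ids]

print("SRCC:", srcc(scores, labels))
print("PLCC:", plcc(scores, labels))
```

The sketch uses only the standard library to stay self-contained; in practice `scipy.stats.spearmanr` and `scipy.stats.pearsonr` compute the same two coefficients.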

URL

https://arxiv.org/abs/2404.18409

PDF

https://arxiv.org/pdf/2404.18409.pdf

