Bridging the Gap Between Saliency Prediction and Image Quality Assessment

2024-05-08 12:04:43
Kirillov Alexey, Andrey Moskalenko, Dmitriy Vatolin

Abstract

Over the past few years, deep neural models have made considerable advances in image quality assessment (IQA). However, the underlying reasons for their success remain unclear, owing to the complex nature of deep neural networks. IQA aims to describe how the human visual system (HVS) works and to create efficient approximations of it. The Saliency Prediction task, on the other hand, aims to emulate the HVS by determining areas of visual interest. Thus, we believe that saliency plays a crucial role in human perception. In this work, we conduct an empirical study that reveals the relation between the IQA and Saliency Prediction tasks, demonstrating that the former incorporates knowledge of the latter. Moreover, we introduce SACID, a novel dataset of saliency-aware compressed images, and conduct a large-scale comparison of classic and neural-based IQA methods. All supplementary code and data will be available at the time of publication.

URL

https://arxiv.org/abs/2405.04997

PDF

https://arxiv.org/pdf/2405.04997.pdf

