Paper Reading AI Learner

Perceptual Constancy Constrained Single Opinion Score Calibration for Image Quality Assessment

2024-04-30 14:42:55
Lei Wang, Desen Yuan

Abstract

In this paper, we propose a highly efficient method to estimate an image's mean opinion score (MOS) from a single opinion score (SOS). Assuming that each SOS is the observed sample of a normal distribution and the MOS is its unknown expectation, the MOS inference is formulated as a maximum likelihood estimation problem, where the perceptual correlation of pairwise images is considered in modeling the likelihood of SOS. More specifically, by means of the quality-aware representations learned from the self-supervised backbone, we introduce a learnable relative quality measure to predict the MOS difference between two images. Then, the current image's maximum likelihood estimate of the MOS is represented by the sum of another reference image's estimated MOS and their relative quality. Ideally, no matter which image is selected as the reference, the MOS of the current image should remain unchanged, which is termed perceptual constancy constrained calibration (PC3). Finally, we alternately optimize the relative quality measure's parameter and the current image's estimated MOS via backpropagation and Newton's method, respectively. Experiments show that the proposed method is efficient in calibrating the biased SOS and significantly improves IQA model learning when only SOSs are available.
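
The calibration idea in the abstract can be illustrated with a small toy, which is only a sketch under stated assumptions and not the paper's actual objective or implementation. It assumes a quadratic objective made of a Gaussian data term (each estimated MOS stays close to its SOS) plus a constancy penalty (each estimate should equal any reference image's estimate plus their relative quality). The relative quality here is a fixed oracle table rather than a learned network, so only the Newton-style MOS update is shown. The names `calibrate_mos`, `rel_quality`, and `lam` are hypothetical.

```python
import random


def calibrate_mos(sos, rel_quality, lam=1.0, n_iters=100):
    """Toy MOS calibration by coordinate-wise Newton updates.

    Assumed objective (an illustration, not taken from the paper):
        J(mu) = sum_i (sos[i] - mu[i])**2
              + lam * sum_{i != j} (mu[i] - mu[j] - rel_quality[i][j])**2
    rel_quality[i][j] stands in for the predicted MOS difference between
    image i and image j.
    """
    n = len(sos)
    mu = list(sos)  # start from the raw single opinion scores
    for _ in range(n_iters):
        for i in range(n):
            # Gradient and second derivative of J with respect to mu[i],
            # holding every other estimate fixed.
            grad = -2.0 * (sos[i] - mu[i])
            hess = 2.0
            for j in range(n):
                if j == i:
                    continue
                grad += 2.0 * lam * (mu[i] - mu[j] - rel_quality[i][j])
                grad -= 2.0 * lam * (mu[j] - mu[i] - rel_quality[j][i])
                hess += 4.0 * lam
            # J is quadratic in mu[i], so one Newton step is exact here.
            mu[i] -= grad / hess
    return mu


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic data: hypothetical ground-truth MOS values and noisy SOS samples.
random.seed(0)
true_mos = [1.0, 2.0, 3.0, 4.0, 5.0]
sos = [m + random.gauss(0.0, 0.5) for m in true_mos]

# Oracle relative quality (assumption): the exact MOS difference per pair.
rel_quality = [[mi - mj for mj in true_mos] for mi in true_mos]

mu = calibrate_mos(sos, rel_quality, lam=1.0)
print("raw SOS error:       ", mse(sos, true_mos))
print("calibrated MOS error:", mse(mu, true_mos))
```

With an oracle relative quality, the penalty pulls every estimate toward a common offset from the ground truth, so the per-image noise is averaged out across references. In the method described above, the relative quality measure is itself learned by backpropagation, alternating with the MOS update; this sketch omits that half.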

URL

https://arxiv.org/abs/2404.19595

PDF

https://arxiv.org/pdf/2404.19595.pdf

