A user model for JND-based video quality assessment: theory and applications

2018-07-28 05:32:29
Haiqiang Wang, Ioannis Katsavounidis, Xinfeng Zhang, Chao Yang, C.-C. Jay Kuo

Abstract

Video quality assessment (VQA) technology has attracted considerable attention in recent years due to the increasing demand for video streaming services. Existing VQA methods are designed to predict video quality in terms of the mean opinion score (MOS) calibrated by humans in subjective experiments. However, they cannot predict the satisfied user ratio (SUR) of an aggregated viewer group. Furthermore, they provide little guidance for video coding parameter selection, e.g. the quantization parameter (QP) of a set of consecutive frames, in practical video streaming services. To overcome these shortcomings, the just-noticeable-difference (JND) based VQA methodology has been proposed as an alternative. It has been observed experimentally that the JND location is a normally distributed random variable. In this work, we explain this distribution by proposing a user model that takes both subject variability and content variability into account. The model is built upon a user's capability to discern the quality difference between video clips encoded with different QPs, and it analyzes video content characteristics to account for inter-content variability. The proposed user model is validated on the data collected in the VideoSet. It is demonstrated that the model can flexibly predict the SUR distribution of a specific user group.
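As a minimal illustration of the Gaussian JND model the abstract describes, the sketch below computes the SUR at a given QP as the fraction of viewers whose first JND has not yet been reached, i.e. SUR(q) = 1 - Φ((q - μ)/σ), where Φ is the standard normal CDF. The function name and the parameter values (μ = 27, σ = 4) are hypothetical, chosen only to show the shape of the curve; they are not taken from the paper.

```python
from math import erf, sqrt

def sur(qp: float, mu: float, sigma: float) -> float:
    """Satisfied user ratio at quantization parameter `qp`,
    assuming the first-JND location is Gaussian N(mu, sigma^2).
    A viewer is 'satisfied' if their JND lies above `qp`, i.e.
    they cannot yet perceive the quality drop at this QP."""
    # Standard normal CDF evaluated at qp, via the error function.
    phi = 0.5 * (1.0 + erf((qp - mu) / (sigma * sqrt(2.0))))
    return 1.0 - phi

# Hypothetical JND statistics for one source clip (not from the paper):
mu, sigma = 27.0, 4.0
for qp in (20, 24, 27, 30, 34):
    print(f"QP={qp:2d}  SUR={sur(qp, mu, sigma):.3f}")
```

Under such a model, picking the largest QP whose SUR stays above a target fraction of viewers (say 0.75) is the kind of coding-parameter guidance the abstract refers to; MOS-based metrics do not expose this group-level quantity directly.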

URL

https://arxiv.org/abs/1807.10894

PDF

https://arxiv.org/pdf/1807.10894.pdf
