Paper Reading AI Learner

Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays

2024-05-01 17:57:12
Andrei Chubarau, Hyunjin Yoo, Tara Akhavan, James Clark

Abstract

Conventional image quality metrics (IQMs), such as PSNR and SSIM, are designed for perceptually uniform gamma-encoded pixel values and cannot be directly applied to perceptually non-uniform linear high-dynamic-range (HDR) colors. Similarly, most of the available datasets consist of standard-dynamic-range (SDR) images collected in standard and possibly uncontrolled viewing conditions. Popular pre-trained neural networks are likewise intended for SDR inputs, restricting their direct application to HDR content. On the other hand, training HDR models from scratch is challenging due to the limited availability of HDR data. In this work, we explore more effective approaches for training deep learning-based models for image quality assessment (IQA) on HDR data. We leverage networks pre-trained on SDR data (source domain) and re-target these models to HDR (target domain) with additional fine-tuning and domain adaptation. We validate our methods on the available HDR IQA datasets, demonstrating that models trained with our combined recipe outperform previous baselines, converge much faster, and reliably generalize to HDR inputs.
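The mismatch the abstract describes — SDR metrics expecting perceptually uniform values versus linear HDR luminance — is commonly bridged by first passing HDR pixels through a perceptual transfer function before applying a conventional metric. The sketch below illustrates this idea with the SMPTE ST 2084 (PQ) curve followed by PSNR; this is a generic illustration of the preprocessing step, not necessarily the exact encoding or metric pipeline used in the paper.

```python
import numpy as np

def pq_encode(lum):
    """Map linear luminance (cd/m^2, up to 10000) to an approximately
    perceptually uniform [0, 1] signal via the SMPTE ST 2084 (PQ) curve."""
    m1, m2 = 0.1593017578125, 78.84375
    c1, c2, c3 = 0.8359375, 18.8515625, 18.6875
    y = np.clip(np.asarray(lum, dtype=np.float64) / 10000.0, 0.0, 1.0) ** m1
    return ((c1 + c2 * y) / (1.0 + c3 * y)) ** m2

def psnr(ref, test, peak=1.0):
    """PSNR computed on (already perceptually encoded) values."""
    mse = np.mean((ref - test) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical linear HDR luminance maps (cd/m^2) standing in for real images.
rng = np.random.default_rng(0)
ref = rng.uniform(0.1, 4000.0, size=(64, 64))
test = np.clip(ref + rng.normal(0.0, 5.0, size=ref.shape), 0.0, None)

# Encoding first makes the error measure roughly perceptually uniform.
score = psnr(pq_encode(ref), pq_encode(test))
print(f"PQ-PSNR: {score:.2f} dB")
```

Computing PSNR directly on the linear values would over-weight errors in bright regions, since equal linear differences are far less visible in highlights than in shadows; the perceptual encoding is what lets an SDR-style metric behave sensibly on HDR content.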


URL

https://arxiv.org/abs/2405.00670

PDF

https://arxiv.org/pdf/2405.00670.pdf

