Blind Multimodal Quality Assessment: A Brief Survey and A Case Study of Low-light Images

Abstract

Blind image quality assessment (BIQA) aims to predict objective quality scores for visual signals automatically and accurately. It is widely used to monitor product and service quality in low-light applications such as smartphone photography, video surveillance, and autonomous driving. Recent developments in this field are dominated by unimodal solutions, which are inconsistent with human subjective rating patterns: human visual perception draws simultaneously on multiple kinds of sensory information (e.g., sight and hearing). In this article, we present a blind multimodal quality assessment (BMQA) of low-light images, covering the path from subjective evaluation to objective score. To investigate the multimodal mechanism, we first establish a multimodal low-light image quality (MLIQ) database with authentic low-light distortions, containing image and audio modality pairs. We then design the key modules of BMQA, addressing multimodal quality representation, latent feature alignment and fusion, and hybrid self-supervised and supervised learning. Extensive experiments show that BMQA yields state-of-the-art accuracy on the proposed MLIQ benchmark database. We also build an independent single-image-modality database, Dark-4K, to verify the applicability and generalization performance of BMQA in mainstream unimodal applications. Qualitative and quantitative results on Dark-4K show that BMQA outperforms existing BIQA approaches, provided that a pre-trained quality semantic description model is available. The proposed framework, the two databases, and the collected BIQA methods and evaluation metrics are made publicly available.

URL

https://arxiv.org/abs/2303.10369

PDF

https://arxiv.org/pdf/2303.10369.pdf