Abstract
Perceptual image quality assessment (IQA) is the task of predicting the visual quality of an image as perceived by a human observer. Current state-of-the-art techniques are based on deep representations trained in a discriminative manner. Such representations may ignore visually important features if they are not predictive of class labels. Recent generative models successfully learn low-dimensional representations using auto-encoding and have been argued to better preserve visual features. Here we leverage existing auto-encoders and propose VAE-QA, a simple and efficient method for predicting image quality in the presence of a full-reference image. We evaluate our approach on four standard benchmarks and find that it significantly improves generalization across datasets, has fewer trainable parameters, a smaller memory footprint, and faster run time.
Abstract (translated)
Perceptual image quality assessment (IQA) predicts how a human observer perceives the visual quality of an image. Current state-of-the-art techniques are based on deep representations trained in a discriminative manner. Such representations may overlook visually important features if these are not predictive of class labels. Recent generative models trained with auto-encoding successfully learn low-dimensional representations and are argued to better preserve visual features. We therefore leverage existing auto-encoders and propose VAE-QA, a simple and effective method for predicting image quality given a full reference. We evaluate our approach on four standard benchmarks and find that it significantly improves generalization across datasets, with fewer trainable parameters, a smaller memory footprint, and faster run time.
URL
https://arxiv.org/abs/2404.18178
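
To make the idea in the abstract concrete, below is a minimal PyTorch sketch of full-reference quality prediction from auto-encoder latents. It is not the architecture described in the paper: the encoder is a stand-in for a pretrained, frozen VAE encoder, and the difference pooling and regression head are illustrative assumptions.

```python
# Conceptual sketch: full-reference IQA from auto-encoder latents.
# Assumptions (not from the paper): a stand-in encoder instead of a real
# pretrained VAE, mean/max pooling of latent differences, and a small MLP head.
import torch
import torch.nn as nn

class StandInVAEEncoder(nn.Module):
    """Placeholder for a frozen, pretrained VAE encoder from a generative model."""
    def __init__(self, latent_channels: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_channels, 4, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # latent feature map for an input image batch

class FullReferenceQualityHead(nn.Module):
    """Regresses a scalar quality score from reference/distorted latent differences."""
    def __init__(self, latent_channels: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * latent_channels, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, z_ref: torch.Tensor, z_dist: torch.Tensor) -> torch.Tensor:
        diff = (z_ref - z_dist).abs()
        # Pool out spatial dimensions; keep per-channel mean and max statistics.
        feats = torch.cat([diff.mean(dim=(2, 3)), diff.amax(dim=(2, 3))], dim=1)
        return self.mlp(feats).squeeze(-1)  # predicted quality (MOS-like) score

encoder = StandInVAEEncoder().eval()   # in practice: a pretrained VAE encoder, frozen
for p in encoder.parameters():
    p.requires_grad_(False)
head = FullReferenceQualityHead()      # the only trainable component in this sketch

ref = torch.rand(2, 3, 256, 256)       # reference images
dist = torch.rand(2, 3, 256, 256)      # distorted versions of the same images
with torch.no_grad():
    z_ref, z_dist = encoder(ref), encoder(dist)
print(head(z_ref, z_dist))             # two predicted quality scores
```

In a full-reference pipeline of this kind, only the small head would be trained against human quality ratings, which is consistent with the abstract's claims of few trainable parameters and a small memory footprint.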