Abstract
Learned image compression codecs have recently achieved impressive compression performance, surpassing the most efficient image coding architectures. However, most approaches are trained to minimize rate and distortion, which often leads to unsatisfactory visual results at low bitrates because perceptual metrics are not taken into account. In this paper, we show that conditional diffusion models, when used as a decoder, can give promising results in the generative compression task. Moreover, given a compressed representation, they allow new tradeoff points between distortion and perception to be created at the decoder side by changing the sampling method.
URL
https://arxiv.org/abs/2403.02887