Abstract
Recent advances in learned image codecs have been extended from human perception toward machine perception. However, progressive image compression with fine granular scalability (FGS), which enables a single bitstream to be decoded at multiple quality levels, remains unexplored for machine-oriented codecs. In this work, we propose PICM-Net, a novel progressive learned image compression codec for machine perception based on trit-plane coding. By analyzing the difference between human- and machine-oriented rate-distortion priorities, we systematically examine latent prioritization strategies for machine-oriented codecs. To further enhance real-world adaptability, we design an adaptive decoding controller that dynamically determines the necessary decoding level at inference time to maintain the desired confidence of the downstream machine prediction. Extensive experiments demonstrate that our approach enables efficient and adaptive progressive transmission while maintaining high performance on the downstream classification task, establishing a new paradigm for machine-aware progressive image compression.
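The two mechanisms named in the abstract, trit-plane coding and confidence-driven selection of the decoding level, can be illustrated with a minimal Python sketch. It is not the PICM-Net implementation: it assumes non-negative integer latents, leaves out entropy coding and latent prioritization entirely, and uses a simple stopping rule (decode one more plane until a confidence score reaches a threshold) that is our assumption rather than the paper's controller. The names `to_trit_planes`, `reconstruct`, `decode_until_confident`, and `confidence_fn` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def to_trit_planes(value: int, num_planes: int) -> List[int]:
    """Split a non-negative integer into ternary digits, most significant first."""
    if not 0 <= value < 3 ** num_planes:
        raise ValueError("value does not fit in the given number of trit planes")
    trits = []
    for _ in range(num_planes):
        trits.append(value % 3)
        value //= 3
    return trits[::-1]


def reconstruct(trits: Sequence[int], num_planes: int) -> float:
    """Estimate a value from its first len(trits) planes.

    The planes not yet decoded leave an interval of 3**remaining candidate
    integers; the centre of that interval is returned as the estimate.
    """
    remaining = num_planes - len(trits)
    prefix = 0
    for t in trits:
        prefix = prefix * 3 + t
    low = prefix * 3 ** remaining
    return low + (3 ** remaining - 1) / 2


def decode_until_confident(
    latent: Sequence[int],
    num_planes: int,
    confidence_fn: Callable[[List[float]], float],
    threshold: float,
) -> Tuple[int, List[float]]:
    """Decode one more trit plane at a time until confidence_fn is satisfied.

    confidence_fn is a hypothetical stand-in for "run the decoder and the
    downstream classifier, then return the top-class probability".
    Returns the number of planes used and the latent estimate at that level.
    """
    planes = [to_trit_planes(v, num_planes) for v in latent]
    level = 0
    estimate = [reconstruct([], num_planes) for _ in planes]
    while level < num_planes and confidence_fn(estimate) < threshold:
        level += 1
        estimate = [reconstruct(p[:level], num_planes) for p in planes]
    return level, estimate
```

In a real system, `confidence_fn` would wrap the synthesis network and the classifier, so each extra plane costs additional bits and one more forward pass. The loop stops at the first level whose prediction is confident enough, which is the trade-off the adaptive decoding controller is described as managing.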
Abstract (translated)
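Recently, image coding techniques for machine perception have made progress in the field of learned image codecs. However, progressive image compression with fine granular scalability (FGS), which allows a single bitstream to be decoded at multiple quality levels, remains an unexplored research direction for machine-oriented techniques. This paper proposes PICM-Net, a novel progressive learned image compression codec based on trit-plane coding and designed specifically for machine perception. By analyzing the differences between human- and machine-oriented rate-distortion priorities, we systematically study latent prioritization strategies in machine-oriented codecs. To further enhance real-world adaptability, we design an adaptive decoding controller that dynamically determines the required decoding level during inference in order to maintain the confidence level needed for downstream machine prediction. Extensive experiments show that our method achieves efficient and adaptive progressive transmission while maintaining high performance on the downstream classification task, thereby establishing a new paradigm for machine-oriented progressive image compression.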
URL
https://arxiv.org/abs/2512.20070