Paper Reading AI Learner

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

2023-03-10 07:08:24
Kedeng Tong, Yaojun Wu, Yue Li, Kai Zhang, Li Zhang, Xin Jin

Abstract

Learned image compression has exhibited promising compression performance, but variable bitrates over a wide range remain a challenge. State-of-the-art variable rate methods compromise the loss of model performance and require numerous additional parameters. In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that utilizes a univariate quantization regulator a to achieve wide-range variable rates within a single model. Specifically, QVRF defines a quantization regulator vector coupled with predefined Lagrange multipliers to control quantization error of all latent representation for discrete variable rates. Additionally, the reparameterization method makes QVRF compatible with a round quantizer. Exhaustive experiments demonstrate that existing fixed-rate VAE-based methods equipped with QVRF can achieve wide-range continuous variable rates within a single model without significant performance degradation. Furthermore, QVRF outperforms contemporary variable-rate methods in rate-distortion performance with minimal additional parameters.

Abstract (translated)

学习的图像压缩表现出令人期望的压缩性能,但广泛的可变比特率仍然是一个挑战。最先进的可变速率方法牺牲了模型性能,并要求大量的额外参数。在本文中,我们提出了一个量化误差aware的可变速率框架(QVRF),它使用单个编码器来实现在一个模型中实现广泛的可变速率。具体来说,QVRF定义了一个量化控制器向量,与预先定义的拉普拉斯移量一起控制离散变量率的所有潜在表示的量化误差。此外,重参数化方法使QVRF与圆形量化器兼容。充分的实验结果表明,现有固定速率VAE-based方法配备了QVRF可以在一个模型中实现广泛的连续可变速率,而没有明显的性能下降。此外,QVRF在比率失真性能方面比 contemporary 可变速率方法出色,仅需要少量的额外参数。

URL

https://arxiv.org/abs/2303.05744

PDF

https://arxiv.org/pdf/2303.05744.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot