Learning Content-Weighted Deep Image Compression

2019-04-01 09:40:37
Mu Li, Wangmeng Zuo, Shuhang Gu, Jane You, David Zhang

Abstract

Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance. Most existing methods adopt spatially invariant bit length allocation and incorporate a discrete entropy approximation to constrain the compression rate. However, information content is spatially variant: regions with complex and salient structures are generally more essential to image compression. Taking the spatial variation of image content into account, this paper presents a content-weighted encoder-decoder model, which includes an importance map subnet that produces an importance mask for locally adaptive bit rate allocation. The sum of the importance mask can thus be used as an alternative to entropy estimation for compression rate control. Furthermore, the quantized representations of the learned code and importance map are still spatially dependent, so they can be losslessly compressed using arithmetic coding. To compress the codes effectively and efficiently, we propose a trimmed convolutional network to predict the conditional probability of the quantized codes. Experiments show that the proposed method produces visually much better results and performs favorably in comparison with deep and traditional lossy image compression approaches.
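The importance-mask idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the details, so everything concrete here is an assumption made for illustration: the function names `quantize_importance` and `importance_mask`, the number of quantization levels, and the rule that each spatial location keeps a leading block of channels proportional to its quantized importance. It is not the authors' implementation.

```python
import numpy as np

def quantize_importance(p, levels):
    """Map importance values in [0, 1] to integer levels 0..levels (assumed rule)."""
    p = np.clip(p, 0.0, 1.0)
    return np.minimum(np.floor(p * levels), levels).astype(int)

def importance_mask(p, n_channels, levels):
    """Build a binary (C, H, W) mask from an (H, W) importance map.

    Assumption: a location with quantized importance q keeps the first
    q * (n_channels // levels) channels of the code and drops the rest.
    """
    if n_channels % levels != 0:
        raise ValueError("n_channels must be divisible by levels")
    q = quantize_importance(p, levels)               # (H, W) integer levels
    keep = q * (n_channels // levels)                # channels kept per location
    k = np.arange(n_channels)[:, None, None]         # (C, 1, 1) channel index
    return (k < keep[None, :, :]).astype(np.float32)

# Toy 2x2 importance map: flat region (0.0) up to highly salient region (1.0).
p = np.array([[0.0, 0.3],
              [0.5, 1.0]])
mask = importance_mask(p, n_channels=8, levels=4)

# Stand-in for a binarized code tensor; masking zeroes the dropped channels.
code = np.ones((8, 2, 2), dtype=np.float32)
masked_code = code * mask

# The sum of the mask counts the retained code elements, which is the
# quantity the abstract proposes as a proxy for the compression rate.
rate_proxy = int(mask.sum())
print(rate_proxy)
```

In this toy case the flat location keeps no channels and the most salient one keeps all eight, so the mask sum grows with the amount of salient content, which is what makes it usable as a rate-control term.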

URL

https://arxiv.org/abs/1904.00664

PDF

https://arxiv.org/pdf/1904.00664.pdf

