DSSLIC: Deep Semantic Segmentation-based Layered Image Compression

2018-07-11 09:03:51
Mohammad Akbari, Jie Liang, Jingning Han

Abstract

Deep learning has revolutionized many computer vision fields in the last few years, including learning-based image compression. In this paper, we propose a deep semantic segmentation-based layered image compression (DSSLIC) framework in which the semantic segmentation map of the input image is obtained and encoded as the base layer of the bit-stream. A compact representation of the input image is also generated and encoded as the first enhancement layer. The segmentation map and the compact version of the image are then used to obtain a coarse reconstruction of the image. The residual between the input and the coarse reconstruction is encoded as a second enhancement layer. Experimental results show that the proposed framework outperforms the H.265/HEVC-based BPG codec and other codecs in both PSNR and MS-SSIM across a wide range of bit rates in the RGB domain. In addition, since the semantic segmentation map is included in the bit-stream, the proposed scheme can facilitate many other tasks, such as image search and object-based adaptive image compression.
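
The layered data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the deep networks are replaced by hypothetical stand-ins (thresholding for segmentation, block averaging for the compact representation, nearest-neighbour upsampling for the coarse reconstruction), the image is a small grayscale grid of integers, and no layer is actually entropy-coded. All function names and the `factor` and `threshold` parameters are assumptions made for illustration only.

```python
# Toy illustration of a three-layer (base + two enhancement) image codec.
# Every function is a hypothetical stand-in for a learned component.

def segment(image, threshold=127):
    # Stand-in for the segmentation network: label each pixel 0 or 1.
    return [[1 if p > threshold else 0 for p in row] for row in image]

def compact(image, factor=2):
    # Stand-in for the compact representation: integer block averaging.
    # Assumes height and width are multiples of `factor`.
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(factor) for dx in range(factor))
            // (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]

def coarse_reconstruct(seg_map, compact_img, factor=2):
    # Stand-in for the reconstruction network: nearest-neighbour upsampling.
    # The segmentation map only supplies the output size in this toy version.
    h, w = len(seg_map), len(seg_map[0])
    return [[compact_img[y // factor][x // factor] for x in range(w)]
            for y in range(h)]

def encode(image, factor=2):
    seg_map = segment(image)                      # base layer
    comp = compact(image, factor)                 # first enhancement layer
    coarse = coarse_reconstruct(seg_map, comp, factor)
    residual = [[p - c for p, c in zip(row_i, row_c)]
                for row_i, row_c in zip(image, coarse)]  # second enhancement layer
    return {"base": seg_map, "enhancement_1": comp, "enhancement_2": residual}

def decode(layers, factor=2):
    # Rebuild the coarse image from the first two layers, then add the residual.
    coarse = coarse_reconstruct(layers["base"], layers["enhancement_1"], factor)
    return [[c + r for c, r in zip(row_c, row_r)]
            for row_c, row_r in zip(coarse, layers["enhancement_2"])]

image = [
    [10, 20, 200, 210],
    [30, 40, 220, 230],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
layers = encode(image)
restored = decode(layers)
```

Because the residual is stored without quantization here, `restored` equals `image` exactly. A practical codec would compress each layer lossily or losslessly, so the final reconstruction would only approximate the input.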

URL

https://arxiv.org/abs/1806.03348

PDF

https://arxiv.org/pdf/1806.03348.pdf

