Paper Reading AI Learner

Neural Network Assisted Lifting Steps For Improved Fully Scalable Lossy Image Compression in JPEG 2000

2024-03-04 00:01:52
Xinyue Li, Aous Naman, David Taubman

Abstract

This work proposes to augment the lifting steps of the conventional wavelet transform with additional neural network assisted lifting steps. These additional steps reduce residual redundancy (notably aliasing information) amongst the wavelet subbands, and also improve the visual quality of reconstructed images at reduced resolutions. The proposed approach involves two steps: a high-to-low step followed by a low-to-high step. The high-to-low step suppresses aliasing in the low-pass band by using the detail bands at the same resolution, while the low-to-high step aims to further remove redundancy from the detail bands, so as to achieve higher energy compaction. The two proposed lifting steps are trained in an end-to-end fashion; we employ a backward annealing approach to overcome the non-differentiability of the quantization and cost functions during back-propagation. Importantly, the networks employed in this paper are compact and have limited non-linearities, allowing a fully scalable system; one pair of trained network parameters is applied for all levels of decomposition and for all bit-rates of interest. By employing the proposed approach within the JPEG 2000 image coding standard, our method can achieve up to 17.4% average BD bit-rate saving over a wide range of bit-rates, while retaining the quality and resolution scalability features of JPEG 2000.
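The two-step structure described in the abstract can be illustrated with a minimal 1D sketch. This is not the authors' implementation: the LeGall 5/3 base transform is chosen only for brevity, the 1D setting, the sign convention, and the function names are assumptions, and `toy_h2l` and `toy_l2h` are hypothetical stand-ins for the trained networks. The point it shows is a general property of lifting: whatever functions are plugged into the extra steps, the transform stays exactly invertible (before quantization), because each step can be undone by adding back what it subtracted.

```python
import numpy as np

def legall53_forward(x):
    """One level of LeGall 5/3 lifting on an even-length 1D signal."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    # Predict step: detail = odd sample minus the mean of its even neighbours.
    even_next = np.append(even[1:], even[-1])      # symmetric extension (right edge)
    high = odd - 0.5 * (even + even_next)
    # Update step: low-pass = even sample plus a quarter of neighbouring details.
    high_prev = np.insert(high[:-1], 0, high[0])   # symmetric extension (left edge)
    low = even + 0.25 * (high_prev + high)
    return low, high

def legall53_inverse(low, high):
    """Undo the update step, then the predict step, then interleave."""
    high_prev = np.insert(high[:-1], 0, high[0])
    even = low - 0.25 * (high_prev + high)
    even_next = np.append(even[1:], even[-1])
    odd = high + 0.5 * (even + even_next)
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

# Hypothetical placeholders for the trained networks (assumptions, not from the paper).
def toy_h2l(high):
    return 0.1 * np.tanh(high)

def toy_l2h(low):
    return 0.05 * np.tanh(low - np.roll(low, 1))

def extra_forward(low, high, h2l, l2h):
    """Additional lifting steps: high-to-low first, then low-to-high."""
    low2 = low - h2l(high)       # adjust the low-pass band using the detail band
    high2 = high - l2h(low2)     # adjust the detail band using the updated low-pass band
    return low2, high2

def extra_inverse(low2, high2, h2l, l2h):
    """Invert the additional steps in reverse order."""
    high = high2 + l2h(low2)
    low = low2 + h2l(high)
    return low, high
```

A round trip such as `legall53_inverse(*extra_inverse(*extra_forward(*legall53_forward(x), toy_h2l, toy_l2h), toy_h2l, toy_l2h))` recovers `x` up to floating-point error for any even-length `x`, regardless of which functions are passed as `h2l` and `l2h`.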

URL

https://arxiv.org/abs/2403.01647

PDF

https://arxiv.org/pdf/2403.01647.pdf

