
Hierarchical Disentangled Representation for Invertible Image Denoising and Beyond

2023-01-31 01:24:34
Wenchao Du, Hu Chen, Yi Zhang, H. Yang

Abstract

Image denoising is a typical ill-posed problem due to complex degradation. Leading methods based on normalizing flows have tried to solve this problem with an invertible transformation instead of a deterministic mapping. However, the implicit bijective mapping is not well explored. Inspired by the observation that noise tends to appear in the high-frequency part of an image, we propose a fully invertible denoising method that injects the idea of disentangled learning into a general invertible neural network to split noise from the high-frequency part. More specifically, we decompose the noisy image into clean low-frequency and hybrid high-frequency parts with an invertible transformation and then disentangle case-specific noise and high-frequency components in the latent space. In this way, denoising becomes tractable: the noiseless low- and high-frequency parts are inversely merged to recover the clean image. Furthermore, we construct a flexible hierarchical disentangling framework that decomposes most of the low-frequency image information while disentangling noise from the high-frequency part in a coarse-to-fine manner. Extensive experiments on real image denoising, JPEG compressed artifact removal, and medical low-dose CT image restoration demonstrate that the proposed method achieves competitive performance in both quantitative metrics and visual quality, with significantly lower computational cost.
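
The pipeline the abstract describes (invertible frequency decomposition, disentanglement of noise from high-frequency detail in latent space, inverse merging) can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation: it uses a fixed invertible Haar wavelet transform for the low/high-frequency split and a simple soft-threshold as a placeholder for the paper's learned, case-specific disentanglement; the names `HaarDownsampling`, `denoise_one_level`, and `threshold` are illustrative assumptions.

```python
# Minimal sketch of the split -> disentangle -> inverse-merge idea from the
# abstract. NOT the authors' code: the disentangling step here is a simple
# soft-threshold standing in for the paper's learned latent disentanglement.
import torch
import torch.nn.functional as F
from torch import nn


class HaarDownsampling(nn.Module):
    """Invertible 2x Haar wavelet transform; the basis is orthonormal,
    so the inverse is a transposed convolution with the same weights."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.in_channels = in_channels
        haar = 0.5 * torch.tensor([
            [[1.,  1.], [ 1.,  1.]],   # low-frequency (average) band
            [[1., -1.], [ 1., -1.]],   # high-frequency detail band
            [[1.,  1.], [-1., -1.]],   # high-frequency detail band
            [[1., -1.], [-1.,  1.]],   # high-frequency detail band
        ])
        weight = torch.zeros(4 * in_channels, in_channels, 2, 2)
        for k in range(4):                      # band index
            for c in range(in_channels):        # input channel
                weight[k * in_channels + c, c] = haar[k]
        self.register_buffer("weight", weight)

    def forward(self, x):
        y = F.conv2d(x, self.weight, stride=2)
        # First `in_channels` maps are the low-frequency band, rest are detail.
        return y[:, :self.in_channels], y[:, self.in_channels:]

    def inverse(self, low, high):
        y = torch.cat([low, high], dim=1)
        return F.conv_transpose2d(y, self.weight, stride=2)


def denoise_one_level(x, threshold=0.05):
    """One level of the coarse-to-fine scheme: keep the clean low-frequency
    band, suppress noise in the high-frequency bands, and invert."""
    haar = HaarDownsampling(x.shape[1])
    low, high = haar(x)
    # Placeholder disentanglement: soft-threshold the high-frequency latent.
    high_clean = torch.sign(high) * torch.clamp(high.abs() - threshold, min=0.0)
    return haar.inverse(low, high_clean)


if __name__ == "__main__":
    noisy = torch.rand(1, 3, 64, 64)
    print(denoise_one_level(noisy).shape)       # torch.Size([1, 3, 64, 64])
    # Sanity check: without thresholding the transform is exactly invertible.
    haar = HaarDownsampling(3)
    low, high = haar(noisy)
    assert torch.allclose(haar.inverse(low, high), noisy, atol=1e-5)
```

In the paper, the disentangling step is learned and the decomposition is applied hierarchically in a coarse-to-fine manner; the sketch shows a single fixed-wavelet level only to make the invertibility of the split-and-merge explicit.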

URL

https://arxiv.org/abs/2301.13358

PDF

https://arxiv.org/pdf/2301.13358.pdf

