Low-Light Image Enhancement with Wavelet-based Diffusion Models

2023-06-01 03:08:28
Hai Jiang, Ao Luo, Songchen Han, Haoqiang Fan, Shuaicheng Liu

Abstract

Diffusion models have achieved promising results in image restoration tasks, yet they are time-consuming, consume excessive computational resources, and produce unstable restorations. To address these issues, we propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL. Specifically, we present a wavelet-based conditional diffusion model (WCDM) that leverages the generative power of diffusion models to produce results with satisfactory perceptual fidelity. It also takes advantage of the wavelet transform to greatly accelerate inference and reduce computational resource usage without sacrificing information. To avoid chaotic content and diversity, we perform both forward diffusion and reverse denoising in the training phase of WCDM, enabling the model to achieve stable denoising and reduce randomness during inference. We further design a high-frequency restoration module (HFRM) that uses the vertical and horizontal details of the image to complement the diagonal information for better fine-grained restoration. Extensive experiments on publicly available real-world benchmarks demonstrate that our method outperforms existing state-of-the-art methods both quantitatively and visually, and that it achieves remarkable improvements in efficiency compared to previous diffusion-based methods. In addition, we show empirically that applying our method to low-light face detection reveals its latent practical value.
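
The claim that a wavelet transform cuts computation "without sacrificing information" can be illustrated with a small sketch. The abstract does not say which wavelet DiffLL uses, so the single-level 2D Haar transform below is an assumption, and the function names haar_dwt2 and haar_idwt2 are hypothetical helpers rather than the authors' code. The sketch shows two properties only: the low-frequency band has a quarter of the pixels of the input, and the four sub-bands together reconstruct the input exactly.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform (illustrative; assumes even height and width)."""
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # low-frequency (average) band
    row_diff = (a + b - c - d) / 2.0  # difference between rows of each block
    col_diff = (a - b + c - d) / 2.0  # difference between columns of each block
    diag = (a - b - c + d) / 2.0      # diagonal detail band
    return low, row_diff, col_diff, diag


def haar_idwt2(low, row_diff, col_diff, diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (low + row_diff + col_diff + diag) / 2.0
    x[0::2, 1::2] = (low + row_diff - col_diff - diag) / 2.0
    x[1::2, 0::2] = (low - row_diff + col_diff - diag) / 2.0
    x[1::2, 1::2] = (low - row_diff - col_diff + diag) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))  # stand-in for one channel of a low-light image
    bands = haar_dwt2(image)
    print("input shape:", image.shape, "low-band shape:", bands[0].shape)
    print("lossless:", np.allclose(haar_idwt2(*bands), image))
```

Under this reading, a model that operates on the low-frequency band processes four times fewer pixels per transform level, while the three detail bands (the kind of vertical, horizontal, and diagonal information the HFRM is described as restoring) keep what is needed to return to full resolution. Which of the two difference bands is called "horizontal" or "vertical" varies between conventions, so the sketch names them by how they are computed.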

URL

https://arxiv.org/abs/2306.00306

PDF

https://arxiv.org/pdf/2306.00306.pdf

