Paper Reading AI Learner

TS-Diff: Two-Stage Diffusion Model for Low-Light RAW Image Enhancement

2025-05-07 09:35:05
Yi Li, Zhiyuan Zhang, Jiangnan Xia, Jianghan Cheng, Qilong Wu, Junwei Li, Yibin Tian, Hui Kong

Abstract

This paper presents a novel Two-Stage Diffusion Model (TS-Diff) for enhancing extremely low-light RAW images. In the pre-training stage, TS-Diff synthesizes noisy images by constructing multiple virtual cameras based on a noise space. Camera Feature Integration (CFI) modules are then designed to enable the model to learn generalizable features across diverse virtual cameras. During the aligning stage, CFIs are averaged to create a target-specific CFI$^T$, which is fine-tuned using a small amount of real RAW data to adapt to the noise characteristics of specific cameras. A structural reparameterization technique further simplifies CFI$^T$ for efficient deployment. To address color shifts during the diffusion process, a color corrector is introduced to ensure color consistency by dynamically adjusting global color distributions. Additionally, a novel dataset, QID, is constructed, featuring quantifiable illumination levels and a wide dynamic range, providing a comprehensive benchmark for training and evaluation under extreme low-light conditions. Experimental results demonstrate that TS-Diff achieves state-of-the-art performance on multiple datasets, including QID, SID, and ELD, excelling in denoising, generalization, and color consistency across various cameras and illumination levels. These findings highlight the robustness and versatility of TS-Diff, making it a practical solution for low-light imaging applications. Source codes and models are available at this https URL
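The abstract's aligning stage initializes a target-specific CFI$^T$ by averaging the parameters of the per-virtual-camera CFI modules before fine-tuning on real RAW data. A minimal sketch of that averaging step is shown below; the parameter names, shapes, and dict-based representation are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of the aligning-stage initialization described in the
# abstract: the CFI parameters learned for several virtual cameras are
# averaged to form a single target CFI^T, which would then be fine-tuned on
# a small amount of real RAW data. Parameter names are illustrative.

def average_cfi(cfi_params_per_camera):
    """Element-wise average of the parameter dicts of several CFI modules."""
    n = len(cfi_params_per_camera)
    keys = cfi_params_per_camera[0].keys()
    return {k: sum(p[k] for p in cfi_params_per_camera) / n for k in keys}

# Three virtual cameras, each with toy scalar "scale" and "bias" parameters.
cameras = [
    {"scale": 1.0, "bias": 0.0},
    {"scale": 2.0, "bias": 0.3},
    {"scale": 3.0, "bias": 0.6},
]
cfi_t = average_cfi(cameras)
print(cfi_t["scale"])  # -> 2.0
```

In practice the averaged module would then be fine-tuned and, per the abstract, simplified via structural reparameterization for deployment; both steps are beyond this toy illustration.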

Abstract (translated)

This paper proposes a novel Two-Stage Diffusion Model (TS-Diff) for enhancing RAW images captured under extremely low-light conditions. In the pre-training stage, TS-Diff synthesizes noisy images by constructing multiple virtual cameras based on a noise space, and Camera Feature Integration (CFI) modules are designed so that the model can learn generalizable features across the different virtual cameras. In the aligning stage, the CFIs are averaged to obtain a target-specific CFI$^T$, which is fine-tuned with a small amount of real RAW data to adapt to the noise characteristics of a specific camera. A structural reparameterization technique is further introduced to simplify CFI$^T$ for efficient deployment. To address color shifts that arise during the diffusion process, a color corrector is designed to ensure color consistency by dynamically adjusting the global color distribution. To support training and evaluation under low-light conditions, a new dataset, QID, with quantifiable illumination levels and a wide dynamic range is constructed, providing a comprehensive benchmark. Experimental results show that TS-Diff performs excellently in denoising, generalization, and color consistency on multiple datasets, including QID, SID, and ELD, delivering stable performance across different camera models and illumination conditions. These findings highlight the strong adaptability and practicality of TS-Diff, making it a compelling solution for extremely low-light imaging applications. Source codes and models are available at this https URL.

URL

https://arxiv.org/abs/2505.04281

PDF

https://arxiv.org/pdf/2505.04281.pdf

