Abstract
While the diffusion transformer (DiT) has attracted considerable interest in recent years, its application to low-light image enhancement remains largely unexplored. Current methods recover details from low-light images but inevitably amplify the noise in them, resulting in poor visual quality. In this paper, we are the first to introduce DiT into the low-light enhancement task, and we design a novel Structure-guided Diffusion Transformer based Low-light image enhancement (SDTL) framework. We compress features with a wavelet transform, which improves the inference efficiency of the model and captures multi-directional frequency bands. We then propose a Structure Enhancement Module (SEM) that uses a structural prior to enhance texture and leverages an adaptive fusion strategy to achieve more accurate enhancement. In addition, we propose a Structure-guided Attention Block (SAB) that pays more attention to texture-rich tokens and avoids interference from noisy areas during noise prediction. Extensive qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art (SOTA) performance on several popular datasets, validating the effectiveness of SDTL in improving image quality and the potential of DiT in low-light enhancement tasks.
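To make the wavelet compression step concrete, here is a minimal sketch of a single-level 2D wavelet transform. It assumes a Haar wavelet and a single-channel input, neither of which the abstract specifies, and the names `haar_dwt2` and `haar_idwt2` are hypothetical. It is an illustration of the general technique, not the SDTL implementation. The point it shows is that one level turns an H x W map into four H/2 x W/2 sub-bands (one approximation plus three directional detail bands), so each band has a quarter of the spatial positions while the input stays exactly recoverable.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of an (H, W) array.

    Assumption for illustration: Haar wavelet, even H and W.
    Returns four (H/2, W/2) sub-bands.
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0    # low-pass: coarse image content
    col_diff = (a - b + c - d) / 2.0  # left minus right: vertical edges
    row_diff = (a + b - c - d) / 2.0  # top minus bottom: horizontal edges
    diag = (a - b - c + d) / 2.0      # diagonal detail
    return approx, col_diff, row_diff, diag


def haar_idwt2(approx, col_diff, row_diff, diag):
    """Inverse of haar_dwt2: rebuild the (2H, 2W) array from its sub-bands."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + col_diff + row_diff + diag) / 2.0
    x[0::2, 1::2] = (approx - col_diff + row_diff - diag) / 2.0
    x[1::2, 0::2] = (approx + col_diff - row_diff - diag) / 2.0
    x[1::2, 1::2] = (approx - col_diff - row_diff + diag) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))          # stand-in for a feature map
    bands = haar_dwt2(image)
    print([band.shape for band in bands])  # four half-resolution sub-bands
    print(np.allclose(haar_idwt2(*bands), image))
```

Because the transform is orthonormal, no information is lost: a model operating on the half-resolution sub-bands processes fewer spatial positions per band, and the full-resolution result can be restored with the inverse transform.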
Abstract (translated)
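Although the diffusion transformer (DiT) has become a research hotspot in recent years, its application to low-light image enhancement remains largely unexplored. Existing methods recover the details of low-light images but inevitably amplify the noise in them, leading to poor visual quality. In this paper, we introduce DiT into the low-light enhancement task for the first time and design a novel Structure-guided Diffusion Transformer based Low-light image enhancement (SDTL) framework. We compress features with a wavelet transform to improve the inference efficiency of the model and to capture multi-directional frequency bands. We then propose a Structure Enhancement Module (SEM) that uses a structural prior to enhance texture and adopts an adaptive fusion strategy to achieve more accurate enhancement. In addition, we propose a Structure-guided Attention Block (SAB) that pays more attention to texture-rich tokens and avoids interference from noisy regions during noise prediction. Extensive qualitative and quantitative experiments show that our method achieves state-of-the-art performance on several popular benchmark datasets, validating the effectiveness of the SDTL framework in improving image quality and the potential of DiT in low-light enhancement tasks.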
URL
https://arxiv.org/abs/2504.15054