Abstract
Photo enhancement plays a crucial role in improving the visual aesthetics of a photograph. Recent photo enhancement methods have either focused on enhancement performance, producing powerful models that cannot be deployed on edge devices, or prioritized computational efficiency, resulting in inadequate performance for real-world applications. To address this trade-off, this paper introduces a pyramid network called LLF-LUT++, which integrates global and local operators through closed-form Laplacian pyramid decomposition and reconstruction. This approach enables fast processing of high-resolution images while also achieving excellent performance. Specifically, we use an image-adaptive 3D LUT that exploits the global tonal characteristics of downsampled images, combined with two distinct weight fusion strategies, to achieve coarse global image enhancement. To implement this strategy, we design a spatial-frequency transformer weight predictor that extracts the required weights by leveraging frequency features. Additionally, we apply local Laplacian filters to adaptively refine edge details in the high-frequency components. With a redesigned network structure and transformer model, LLF-LUT++ not only achieves a 2.64 dB improvement in PSNR on the HDR+ dataset, but also further reduces runtime, processing 4K-resolution images in just 13 ms on a single GPU. Extensive experimental results on two benchmark datasets further show that the proposed approach performs favorably compared to state-of-the-art methods. The source code will be made publicly available.
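The two building blocks named in the abstract, a closed-form Laplacian pyramid and a weighted fusion of basis 3D LUTs, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 2x2 average-pooling downsampler, the nearest-neighbour upsampler and LUT lookup, the LUT size, and every function name below are assumptions made for illustration. The fusion weights are passed in by hand, whereas the paper predicts them with a spatial-frequency transformer.

```python
import numpy as np


def downsample(img):
    """Halve resolution with 2x2 average pooling (assumed stand-in for blur + subsample)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition (assumed interpolation)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [high-frequency residuals..., low-frequency base].

    Height and width must be divisible by 2**levels.
    """
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        # The residual keeps exactly what the downsample/upsample round trip loses.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert build_laplacian_pyramid by adding residuals back, coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current


def identity_lut(d):
    """A (d, d, d, 3) LUT that maps every RGB grid point to itself."""
    grid = np.linspace(0.0, 1.0, d)
    r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
    return np.stack([r, g, b], axis=-1)


def fuse_luts(basis_luts, weights):
    """Weighted sum of N basis LUTs: (N, d, d, d, 3) and (N,) -> (d, d, d, 3)."""
    return np.tensordot(weights, basis_luts, axes=1)


def apply_lut_nearest(img, lut):
    """Look up each RGB pixel (values in [0, 1]) at its nearest LUT grid point."""
    d = lut.shape[0]
    idx = np.clip(np.rint(img * (d - 1)), 0, d - 1).astype(int)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))

    pyramid = build_laplacian_pyramid(image, levels=3)

    # Global tone mapping on the small low-frequency base only.
    basis = np.stack([identity_lut(17), identity_lut(17) ** 0.5])
    weights = np.array([0.7, 0.3])  # hypothetical; the paper predicts these
    pyramid[-1] = apply_lut_nearest(np.clip(pyramid[-1], 0.0, 1.0), fuse_luts(basis, weights))

    enhanced = reconstruct(pyramid)
    print(enhanced.shape)
```

Because each residual stores exactly what the downsample/upsample round trip discards, reconstruction of an unmodified pyramid recovers the input up to floating-point error. That property is what allows the expensive global operator to run on the small base level while the high-frequency levels are refined separately. In this sketch the residuals are left untouched, whereas the paper refines them with local Laplacian filters.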
URL
https://arxiv.org/abs/2510.11613