High-resolution Photo Enhancement in Real-time: A Laplacian Pyramid Network

2025-10-13 16:52:32
Feng Zhang, Haoyou Deng, Zhiqiang Li, Lida Li, Bin Xu, Qingbo Lu, Zisheng Cao, Minchen Wei, Changxin Gao, Nong Sang, Xiang Bai

Abstract

Photo enhancement plays a crucial role in augmenting the visual aesthetics of a photograph. In recent years, photo enhancement methods have either focused on enhancement performance, producing powerful models that cannot be deployed on edge devices, or prioritized computational efficiency, resulting in inadequate performance for real-world applications. To this end, this paper introduces a pyramid network called LLF-LUT++, which integrates global and local operators through closed-form Laplacian pyramid decomposition and reconstruction. This approach enables fast processing of high-resolution images while also achieving excellent performance. Specifically, we utilize an image-adaptive 3D LUT that capitalizes on the global tonal characteristics of downsampled images, while incorporating two distinct weight fusion strategies to achieve coarse global image enhancement. To implement this strategy, we designed a spatial-frequency transformer weight predictor that effectively extracts the desired distinct weights by leveraging frequency features. Additionally, we apply local Laplacian filters to adaptively refine edge details in high-frequency components. After meticulously redesigning the network structure and transformer model, LLF-LUT++ not only achieves a 2.64 dB improvement in PSNR on the HDR+ dataset, but also further reduces runtime, with 4K resolution images processed in just 13 ms on a single GPU. Extensive experimental results on two benchmark datasets further show that the proposed approach performs favorably compared to state-of-the-art methods. The source code will be made publicly available at this https URL.
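The pipeline described in the abstract (decompose into a Laplacian pyramid, enhance the low-resolution base with a weighted fusion of 3D LUTs, then reconstruct) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes 2x2 average pooling for downsampling, nearest-neighbour upsampling, nearest-neighbour LUT lookup instead of interpolation, and fixed fusion weights standing in for the learned weight predictor. The local Laplacian filtering of the high-frequency layers is omitted, so they pass through unchanged. All function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve H and W by averaging each 2x2 block (H and W must be even)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(img):
    """Double H and W by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Return (high-frequency residuals, low-resolution base)."""
    highs, cur = [], img
    for _ in range(levels):
        low = downsample(cur)
        highs.append(cur - upsample(low))  # detail lost by downsampling
        cur = low
    return highs, cur

def reconstruct(highs, low):
    """Invert build_pyramid: upsample and add each residual back."""
    cur = low
    for high in reversed(highs):
        cur = upsample(cur) + high
    return cur

def fuse_luts(basis_luts, weights):
    """Weighted sum of K basis LUTs, shape (K, S, S, S, 3) -> (S, S, S, 3)."""
    return np.tensordot(weights, basis_luts, axes=1)

def apply_lut_nearest(img, lut):
    """Look up each RGB pixel (values in [0, 1]) at the nearest LUT grid point."""
    s = lut.shape[0]
    idx = np.clip(np.rint(img * (s - 1)).astype(int), 0, s - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

def identity_lut(size):
    """LUT that maps every colour to itself at the grid points."""
    grid = np.linspace(0.0, 1.0, size)
    return np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)

# Toy usage: a random 16x16 RGB image and two basis LUTs.
rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))
highs, low = build_pyramid(image, levels=2)

ident = identity_lut(9)
basis = np.stack([ident, 1.0 - ident])   # identity and colour-inverting LUT
weights = np.array([0.75, 0.25])         # assumed fixed; learned in the paper
fused = fuse_luts(basis, weights)

enhanced_low = apply_lut_nearest(low, fused)
output = reconstruct(highs, enhanced_low)  # highs left unrefined in this sketch
```

Because each residual stores exactly what downsampling discarded, `reconstruct(highs, low)` recovers the input up to floating-point error. That exact inversion is what allows the costly global operator to run only on the small base image.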

URL

https://arxiv.org/abs/2510.11613

PDF

https://arxiv.org/pdf/2510.11613.pdf
