SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer

2025-06-16 13:25:12
Zerui Gong, Zhonghua Wu, Qingyi Tao, Qinyue Li, Chen Change Loy

Abstract

Photorealistic style transfer (PST) enables real-world color grading by adapting reference image colors while preserving content structure. Existing methods mainly follow one of two approaches: generation-based methods that prioritize stylistic fidelity at the cost of content integrity and efficiency, or global color transformation methods such as look-up tables (LUTs), which preserve structure but lack local adaptability. To bridge this gap, we propose Spatial Adaptive 4D Look-Up Table (SA-LUT), combining LUT efficiency with neural network adaptability. SA-LUT features: (1) a Style-guided 4D LUT Generator that extracts multi-scale features from the style image to predict a 4D LUT, and (2) a Context Generator using content-style cross-attention to produce a context map. This context map enables spatially-adaptive adjustments, allowing our 4D LUT to apply precise color transformations while preserving structural integrity. To establish a rigorous evaluation framework for photorealistic style transfer, we introduce PST50, the first benchmark specifically designed for PST assessment. Experiments demonstrate that SA-LUT substantially outperforms state-of-the-art methods, achieving a 66.7% reduction in LPIPS score compared to 3D LUT approaches, while maintaining real-time performance at 16 FPS for video stylization. Our code and benchmark are available at this https URL
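
The abstract does not spell out how the context map enters the look-up, so the following is only a minimal sketch of one plausible reading: each pixel's context value is treated as a fourth coordinate next to R, G and B, and the output color is read from the 4D LUT by quadrilinear interpolation. The function names `apply_4d_lut` and `identity_4d_lut`, the `(C, N, N, N, 3)` table layout, and the use of NumPy are assumptions made for illustration, not the authors' implementation. In SA-LUT the LUT and the context map are predicted by neural networks; here they are simply passed in as arrays.

```python
import numpy as np


def identity_4d_lut(n_ctx, n):
    """Hypothetical helper: a 4D LUT that returns the input color for any context."""
    grid = np.linspace(0.0, 1.0, n)
    r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
    rgb = np.stack([r, g, b], axis=-1)  # (n, n, n, 3)
    return np.broadcast_to(rgb, (n_ctx, n, n, n, 3)).copy()


def apply_4d_lut(image, context, lut):
    """Apply a 4D LUT with quadrilinear interpolation (assumed lookup scheme).

    image:   (H, W, 3) float array in [0, 1], RGB.
    context: (H, W) float array in [0, 1], one value per pixel.
    lut:     (C, N, N, N, 3) float array indexed as lut[c, r, g, b] -> RGB,
             with C >= 2 and N >= 2.
    """
    sizes = np.array(lut.shape[:4])
    assert np.all(sizes >= 2), "each LUT axis needs at least two grid points"

    # Stack the four lookup coordinates (context, r, g, b) and scale to grid units.
    coords = np.concatenate([context[..., None], image], axis=-1)
    pos = np.clip(coords, 0.0, 1.0) * (sizes - 1)

    # Lower corner of the enclosing 4D cell and the fractional offset inside it.
    lo = np.minimum(np.floor(pos).astype(int), sizes - 2)
    frac = pos - lo

    out = np.zeros(image.shape, dtype=np.float64)
    # Accumulate the weighted contribution of the cell's 16 corners.
    for corner in range(16):
        offs = np.array([(corner >> k) & 1 for k in range(4)])
        idx = lo + offs
        weight = np.prod(np.where(offs == 1, frac, 1.0 - frac), axis=-1)
        out += weight[..., None] * lut[idx[..., 0], idx[..., 1], idx[..., 2], idx[..., 3]]
    return out
```

Because the context value varies per pixel, two pixels with the same RGB input can map to different output colors. A plain 3D LUT cannot express this, since it applies one global mapping to the whole image.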

URL

https://arxiv.org/abs/2506.13465

PDF

https://arxiv.org/pdf/2506.13465.pdf

