SmartSplat: Feature-Smart Gaussians for Scalable Compression of Ultra-High-Resolution Images

2025-12-23 14:00:55
Linfei Li, Lin Zhang, Zhong Wang, Ying Shen

Abstract

Recent advances in generative AI have accelerated the production of ultra-high-resolution visual content, posing significant challenges for efficient compression and real-time decoding on end-user devices. Inspired by 3D Gaussian Splatting, recent 2D Gaussian image models improve representation efficiency, yet existing methods struggle to balance compression ratio and reconstruction fidelity in ultra-high-resolution scenarios. To address this issue, we propose SmartSplat, a highly adaptive and feature-aware GS-based image compression framework that supports arbitrary image resolutions and compression ratios. SmartSplat leverages image-aware features such as gradients and color variances, introducing a Gradient-Color Guided Variational Sampling strategy together with an Exclusion-based Uniform Sampling scheme to improve the non-overlapping coverage of Gaussian primitives in pixel space. In addition, we propose a Scale-Adaptive Gaussian Color Sampling method to enhance color initialization across scales. Through joint optimization of spatial layout, scale, and color initialization, SmartSplat efficiently captures both local structures and global textures using a limited number of Gaussians, achieving high reconstruction quality under strong compression. Extensive experiments on DIV8K and a newly constructed 16K dataset demonstrate that SmartSplat consistently outperforms state-of-the-art methods at comparable compression ratios and exceeds their compression limits, showing strong scalability and practical applicability. The code is publicly available at this https URL.
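The abstract names two placement ideas: sampling Gaussian positions guided by gradients and color variance, and an exclusion-based uniform pass that fills the remaining area without crowding. The sketch below illustrates those ideas only in spirit and is not the authors' algorithm. The function names, the mixing weight `alpha`, the window `radius`, the `min_dist` threshold, and the use of a grayscale grid in place of color channels are all assumptions made for this illustration.

```python
import random


def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2D grayscale grid."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            grad[y][x] = (gx * gx + gy * gy) ** 0.5
    return grad


def local_variance(img, radius=1):
    """Intensity variance in a square window (stand-in for color variance)."""
    h, w = len(img), len(img[0])
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(y - radius, 0), min(y + radius + 1, h))
                      for xx in range(max(x - radius, 0), min(x + radius + 1, w))]
            mean = sum(window) / len(window)
            var[y][x] = sum((v - mean) ** 2 for v in window) / len(window)
    return var


def feature_guided_samples(img, n, alpha=0.5, seed=0):
    """Pick n distinct pixels, favoring high gradient and high variance."""
    rng = random.Random(seed)
    grad, var = gradient_magnitude(img), local_variance(img)
    g_max = max(max(row) for row in grad) or 1.0
    v_max = max(max(row) for row in var) or 1.0
    keyed = []
    for y in range(len(img)):
        for x in range(len(img[0])):
            # Small epsilon keeps flat regions selectable (hypothetical choice).
            weight = alpha * grad[y][x] / g_max + (1 - alpha) * var[y][x] / v_max + 1e-6
            # Weighted sampling without replacement via random keys u**(1/w).
            keyed.append((rng.random() ** (1.0 / weight), (y, x)))
    keyed.sort(reverse=True)
    return [pos for _, pos in keyed[:n]]


def exclusion_uniform_samples(h, w, n, taken, min_dist, seed=0, max_tries=10000):
    """Uniformly draw up to n pixels at least min_dist from all chosen points."""
    rng = random.Random(seed)
    points = list(taken)
    added = []
    tries = 0
    while len(added) < n and tries < max_tries:
        tries += 1
        y, x = rng.randrange(h), rng.randrange(w)
        if all((y - py) ** 2 + (x - px) ** 2 >= min_dist ** 2 for py, px in points):
            points.append((y, x))
            added.append((y, x))
    return added
```

Calling `feature_guided_samples` first and passing its result as `taken` to `exclusion_uniform_samples` concentrates points on edges and textured areas, then spreads the rest over smooth regions. A real pipeline would also initialize each Gaussian's scale and color, which this sketch leaves out.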

URL

https://arxiv.org/abs/2512.20377

PDF

https://arxiv.org/pdf/2512.20377.pdf

