
Multi-StyleGS: Stylizing Gaussian Splatting with Multiple Styles

2025-06-07 15:54:34
Yangkai Lin, Jiabao Lei, Kui Jia

Abstract

In recent years, there has been a growing demand to stylize a given 3D scene so that it matches the artistic style of reference images for creative purposes. While 3D Gaussian Splatting (GS) has emerged as a promising and efficient method for realistic 3D scene modeling, it remains challenging to adapt 3D GS to multiple styles, whether through automatic local style transfer or manual designation, while keeping stylization training memory efficient. In this paper, we introduce a novel 3D GS stylization solution termed Multi-StyleGS to tackle these challenges. In particular, we employ a bipartite matching mechanism to automatically identify correspondences between the style images and the local regions of the rendered images. To facilitate local style transfer, we introduce a novel semantic style loss function that employs a segmentation network to apply distinct styles to different objects in the scene, and we propose local-global feature matching to enhance multi-view consistency. Furthermore, this technique achieves memory-efficient training, richer texture detail, and better color matching. To assign a robust semantic label to each Gaussian, we propose several techniques to regularize the segmentation network. As demonstrated by our comprehensive experiments, our approach outperforms existing ones in producing plausible stylization results and offering flexible editing.
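
The bipartite matching step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which features, cost function, or solver the authors use, so everything below is an assumption made for illustration: each scene region and each style image is represented by a short hypothetical feature vector, the cost is cosine distance, and the one-to-one assignment is found by exhaustive search, which is only practical for a handful of styles. The function names are hypothetical and not taken from the paper.

```python
import math
from itertools import permutations


def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_styles_to_regions(region_feats, style_feats):
    """Assign one distinct style to each region with minimum total cost.

    Hypothetical helper: returns (assignment, cost), where assignment[i]
    is the index of the style chosen for region i. Requires at least as
    many styles as regions. Uses brute-force search over all one-to-one
    assignments (an assumption; not the paper's stated solver).
    """
    n_regions = len(region_feats)
    # Pairwise cost matrix: rows are regions, columns are styles.
    cost = [[cosine_distance(r, s) for s in style_feats] for r in region_feats]

    best, best_cost = None, math.inf
    for perm in permutations(range(len(style_feats)), n_regions):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best, best_cost = list(perm), total
    return best, best_cost


# Toy features (made up): three scene regions and three style images.
regions = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2], [0.1, 0.1, 1.0]]
styles = [[0.0, 0.9, 0.1], [0.2, 0.0, 1.0], [1.0, 0.1, 0.0]]

assignment, total_cost = match_styles_to_regions(regions, styles)
print(assignment)
```

With more than a few styles, a polynomial-time assignment solver such as the Hungarian algorithm would be the usual replacement for the exhaustive loop; the cost matrix construction stays the same.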


URL

https://arxiv.org/abs/2506.06846

PDF

https://arxiv.org/pdf/2506.06846.pdf

