StylizedGS: Controllable Stylization for 3D Gaussian Splatting

2024-04-08 06:32:11
Dingxi Zhang, Zhuoxun Chen, Yu-Jie Yuan, Fang-Lue Zhang, Zhenliang He, Shiguang Shan, Lin Gao

Abstract

With the rapid development of XR, 3D generation and editing are becoming increasingly important, and stylization is a key tool for editing 3D appearance. Given a single reference style image, it can produce consistent 3D artistic stylization, which makes it a user-friendly way to edit. However, recent NeRF-based 3D stylization methods suffer from efficiency issues that degrade the user experience, and their implicit representation limits their ability to transfer geometric pattern styles. Artists also want flexible control over stylized scenes to support creative exploration. In this paper, we introduce StylizedGS, a 3D neural style transfer framework based on the 3D Gaussian Splatting (3DGS) representation that offers adaptable control over perceptual factors. The 3DGS representation brings high efficiency. We propose a GS filter that, before stylization, eliminates floaters in the reconstruction that would otherwise degrade the stylized result. A nearest-neighbor-based style loss then achieves stylization by fine-tuning the geometry and color parameters of 3DGS, while a depth preservation loss, together with other regularizations, prevents the geometric content from being altered. Moreover, with specially designed losses, StylizedGS lets users customize the result by controlling color, stylization scale, and stylized regions during stylization. Our method attains high-quality stylization results with faithful brushstrokes, geometric consistency, and flexible controls. Extensive experiments across various scenes and styles demonstrate the effectiveness and efficiency of our method in terms of both stylization quality and inference FPS.
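
The abstract names a nearest-neighbor-based style loss and a depth preservation loss but does not give their formulas. The following is a minimal sketch of how such terms are commonly written, not the authors' implementation. It assumes features have already been extracted from the rendered view and the style image (for example by a pretrained CNN) and flattened to rows; the function names, array shapes, cosine distance, and mean-squared depth term are all assumptions made for illustration.

```python
import numpy as np


def nn_style_loss(render_feats, style_feats, eps=1e-8):
    """Mean cosine distance from each rendered feature to its nearest style feature.

    render_feats: (N, D) array of features from the rendered view (assumed shape).
    style_feats:  (M, D) array of features from the style image (assumed shape).
    """
    # Normalize rows so that a dot product equals cosine similarity.
    r = render_feats / (np.linalg.norm(render_feats, axis=1, keepdims=True) + eps)
    s = style_feats / (np.linalg.norm(style_feats, axis=1, keepdims=True) + eps)
    cos_dist = 1.0 - r @ s.T  # (N, M) pairwise cosine distances
    # For every rendered feature keep only its nearest style feature.
    return float(cos_dist.min(axis=1).mean())


def depth_preservation_loss(depth_stylized, depth_original):
    """Mean squared difference between depth maps rendered after and before stylization.

    The squared-error form is an assumption; the abstract only says that a
    depth preservation loss is used.
    """
    diff = np.asarray(depth_stylized, dtype=float) - np.asarray(depth_original, dtype=float)
    return float(np.mean(diff ** 2))
```

In an actual fine-tuning loop these terms would be computed in a differentiable framework and combined with weights so that gradients reach the Gaussian parameters; the NumPy version here only illustrates the arithmetic.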


URL

https://arxiv.org/abs/2404.05220

PDF

https://arxiv.org/pdf/2404.05220.pdf
