X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks

2025-02-13 17:59:15
Zihang Cheng, Huiping Zhuang, Chun Li, Xin Meng, Ming Li, Fei Richard Yu

Abstract

3D Gaussian Splatting (3DGS) has been widely used in 3D reconstruction and 3D generation. Training a 3DGS scene often takes considerable time and resources, and sometimes valuable creative inspiration. The growing number of 3DGS digital assets poses great challenges for copyright protection, yet watermarking targeted at 3DGS remains underexplored. In this paper, we propose a new framework, X-SG$^2$S, which can simultaneously embed 1D to 3D watermark messages while keeping the original 3DGS scene almost unchanged. At a high level, the framework consists of an X-SG$^2$S injector that adds multi-modal messages simultaneously and an extractor that recovers them. Specifically, we first split the watermarks into message patches in a fixed manner and sort the 3DGS points. A self-adaptive gate picks out suitable locations for watermarking. XD (multi-dimensional) injection heads then add the multi-modal messages to the sorted 3DGS points. A learnable gate recognizes the locations that carry extra messages, and XD extraction heads restore the hidden messages from the locations recommended by the learnable gate. Extensive experiments demonstrate that the proposed X-SG$^2$S can effectively conceal multi-modal messages without changing the pretrained 3DGS pipeline or the original form of the 3DGS parameters. Meanwhile, with a simple and efficient model structure and high practicality, X-SG$^2$S performs well at hiding and extracting multi-modal messages, whether internally structured or unstructured. X-SG$^2$S is the first model to unify 1D to 3D watermarking for 3DGS and the first framework to add multi-modal watermarks simultaneously to a single 3DGS scene, which paves the way for later research.
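The first two steps named in the abstract (splitting a watermark into fixed message patches and sorting the 3DGS points) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not give the patch length, the padding rule, or the sort key, so the sketch uses zero-padding and a lexicographic (x, y, z) order. The function names are hypothetical, and the learned gates and injection heads are replaced by a trivial stand-in that takes the first sorted points.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_into_patches(bits: Sequence[int], patch_len: int) -> List[List[int]]:
    """Split a flat bit message into fixed-length patches.

    Assumption: the last patch is zero-padded; the abstract only says
    the split is done "in a fixed manner".
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    padded = list(bits) + [0] * (-len(bits) % patch_len)
    return [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]


def sort_points(points: Sequence[Point]) -> List[int]:
    """Return point indices in a deterministic order.

    Assumption: lexicographic order on (x, y, z); the actual sort key
    used by the paper is not stated in the abstract.
    """
    return sorted(range(len(points)), key=lambda i: points[i])


def assign_patches(order: Sequence[int],
                   patches: Sequence[List[int]]) -> Dict[int, List[int]]:
    """Pair each patch with one point, walking the sorted order.

    Stand-in for the learned self-adaptive gate: it simply takes the
    first len(patches) points instead of scoring locations.
    """
    if len(patches) > len(order):
        raise ValueError("not enough points to carry all patches")
    return {order[k]: patch for k, patch in enumerate(patches)}


# Toy data: a 5-bit message and four Gaussian centers.
message = [1, 0, 1, 1, 0]
centers = [(0.5, 0.0, 0.0), (0.1, 2.0, 0.0), (0.1, 1.0, 0.0), (0.9, 0.0, 0.0)]

patches = split_into_patches(message, patch_len=2)
order = sort_points(centers)
carriers = assign_patches(order, patches)
```

Because both the split and the ordering are deterministic, an extractor that sees the same points can rebuild the same ordering and read the patches back in sequence. That is the property this sketch is meant to show. In the framework described above, learned gates choose the locations instead.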

Abstract (translated)

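3D Gaussian Splatting (3DGS) has been widely applied in 3D reconstruction and generation. Obtaining a 3DGS scene usually consumes substantial time and resources, and sometimes valuable creative inspiration as well. As the number of 3DGS digital assets keeps growing, copyright protection faces major challenges. Research targeting 3DGS, however, is still at a preliminary stage. In this paper, we propose a new framework, X-SG$^2$S, which can embed 1D to 3D messages simultaneously while leaving the original 3DGS scene almost unchanged. Overall, the framework comprises an X-SG$^2$S injector that adds multi-modal messages and an extractor that recovers them. Specifically, we first split the watermarks into message patches in a fixed manner and sort the 3DGS points. A self-adaptive gate selects locations suitable for watermark embedding. XD (multi-dimensional) injection heads then add the multi-modal messages to the sorted 3DGS points. A learnable gate identifies the locations carrying extra messages, and XD extraction heads recover the hidden messages from the locations recommended by that gate. Extensive experiments show that the proposed X-SG$^2$S can effectively hide multi-modal messages without changing the pretrained 3DGS pipeline or the original form of the parameters. Meanwhile, owing to its simple and efficient model structure and high practicality, X-SG$^2$S performs well at hiding and extracting multi-modal messages, whether internally structured or unstructured. As the first framework to unify 1D to 3D watermark embedding and the first to add multi-modal watermarks simultaneously within a single 3DGS, X-SG$^2$S paves the way for subsequent research.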

URL

https://arxiv.org/abs/2502.10475

PDF

https://arxiv.org/pdf/2502.10475.pdf

