GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

2024-04-21 14:42:10
Yuxin Wang, Qianyi Wu, Guofeng Zhang, Dan Xu

Abstract

This paper tackles the challenge of object removal to update a radiance field represented with 3D Gaussian Splatting. The main challenges of this task lie in preserving geometric consistency and maintaining texture coherence given the highly discrete nature of Gaussian primitives. We introduce a robust framework specifically designed to overcome these obstacles. The key insight of our approach is to enhance information exchange between visible and invisible areas, facilitating content restoration in terms of both geometry and texture. Our methodology begins with optimizing the positioning of Gaussian primitives to improve geometric consistency across both removed and visible areas, guided by an online registration process informed by monocular depth estimation. Following this, we employ a novel feature propagation mechanism to bolster texture coherence, leveraging a cross-attention design that bridges Gaussians sampled from both uncertain and certain areas. This approach significantly refines texture coherence within the final radiance field. Extensive experiments validate that our method not only elevates the quality of novel view synthesis for scenes undergoing object removal but also delivers notable efficiency gains in training and rendering speed.
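
The cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with no learned projections, and the function names and toy feature values are hypothetical. Queries stand in for features of Gaussians in the uncertain (removed) region; keys and values stand in for features of Gaussians in the certain (visible) region.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (illustrative only).

    queries: feature vectors for Gaussians in the uncertain region
    keys, values: feature vectors for Gaussians in the certain region
    Returns one aggregated feature vector per query.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Toy features (hypothetical values, not taken from the paper).
certain = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[5.0, 0.0]]
propagated = cross_attention(uncertain, certain, certain)
```

In this toy case the single uncertain-region query is most similar to the first certain-region feature, so the propagated feature is dominated by it. A real system would typically add learned query, key, and value projections and run on batched tensors.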

URL

https://arxiv.org/abs/2404.13679

PDF

https://arxiv.org/pdf/2404.13679.pdf

