D$^2$GS: Depth-and-Density Guided Gaussian Splatting for Stable and Accurate Sparse-View Reconstruction

2025-10-09 17:59:49
Meixi Song, Xin Lin, Dizhe Zhang, Haodong Li, Xiangtai Li, Bo Du, Lu Qi

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) enable real-time, high-fidelity novel view synthesis (NVS) with explicit 3D representations. However, performance degrades and becomes unstable under sparse-view conditions. In this work, we identify two key failure modes: overfitting in regions near the camera with excessive Gaussian density, and underfitting in distant areas with insufficient Gaussian coverage. To address these challenges, we propose D$^2$GS, a unified framework comprising two key components: a Depth-and-Density Guided Dropout strategy that suppresses overfitting by adaptively masking redundant Gaussians based on density and depth, and a Distance-Aware Fidelity Enhancement module that improves reconstruction quality in underfitted far-field regions through targeted supervision. Moreover, we introduce a new evaluation metric that quantifies the stability of learned Gaussian distributions, providing insight into the robustness of sparse-view 3DGS. Extensive experiments on multiple datasets demonstrate that our method significantly improves both visual quality and robustness under sparse-view conditions. The project page can be found at: this https URL.
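The dropout component lends itself to a short illustration. Below is a minimal NumPy sketch of the general idea, assuming access to the Gaussian centers and the camera position; the function name `depth_density_dropout`, the k-nearest-neighbour density proxy, and the probability schedule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def depth_density_dropout(centers, cam_pos, k=8, p_max=0.3, rng=None):
    """Illustrative sketch (not the paper's method): drop Gaussians
    that are both near the camera and locally dense with higher
    probability.

    centers : (N, 3) array of Gaussian means
    cam_pos : (3,) camera position
    Returns a boolean keep-mask of shape (N,).
    """
    rng = np.random.default_rng() if rng is None else rng
    depth = np.linalg.norm(centers - cam_pos, axis=1)            # (N,)

    # Toy-scale density proxy: inverse mean distance to the k nearest
    # neighbours (O(N^2) pairwise distances; fine for a sketch only).
    dists = np.linalg.norm(centers[:, None] - centers[None], axis=-1)
    knn = np.sort(dists, axis=1)[:, 1:k + 1]                     # skip self
    density = 1.0 / (knn.mean(axis=1) + 1e-8)

    # Normalise both cues to [0, 1].
    dens_n = (density - density.min()) / (np.ptp(density) + 1e-8)
    depth_n = (depth - depth.min()) / (np.ptp(depth) + 1e-8)

    # Drop probability grows with density and shrinks with depth, so
    # redundant near-camera Gaussians are masked most aggressively.
    p_drop = p_max * dens_n * (1.0 - depth_n)
    return rng.random(len(centers)) > p_drop
```

In a training loop, such a mask could be resampled each iteration so that dense near-camera Gaussians receive gradients less often, matching the abstract's stated goal of suppressing overfitting near the camera.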

URL

https://arxiv.org/abs/2510.08566

PDF

https://arxiv.org/pdf/2510.08566.pdf

