Paper Reading AI Learner

SRGS: Super-Resolution 3D Gaussian Splatting

2024-04-16 06:58:30
Xiang Feng, Yongbo He, Yubo Wang, Yan Yang, Zhenzhong Kuang, Yu Jun, Jianping Fan, Jiajun Ding

Abstract

Recently, 3D Gaussian Splatting (3DGS) has gained popularity as a novel explicit 3D representation. This approach relies on the representation power of Gaussian primitives to provide high-quality rendering. However, primitives optimized at low resolution inevitably exhibit sparsity and texture deficiency, posing a challenge for high-resolution novel view synthesis (HRNVS). To address this problem, we propose Super-Resolution 3D Gaussian Splatting (SRGS), which performs the optimization in a high-resolution (HR) space. A sub-pixel constraint is introduced for the increased viewpoints in HR space, exploiting the sub-pixel cross-view information of the multiple low-resolution (LR) views. The gradients accumulated from these additional viewpoints facilitate the densification of primitives. Furthermore, a pre-trained 2D super-resolution model is integrated with the sub-pixel constraint, enabling the dense primitives to learn faithful texture features. In general, our method focuses on densification and texture learning to effectively enhance the representation ability of the primitives. Experimentally, our method achieves high rendering quality on HRNVS with only LR inputs, outperforming state-of-the-art methods on challenging datasets such as Mip-NeRF 360 and Tanks & Temples. Related code will be released upon acceptance.
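The abstract does not spell out the loss formulation, but its core idea (render the Gaussians at HR, constrain that rendering against the real LR views via a sub-pixel constraint, and guide texture with a pre-trained 2D super-resolution model) can be sketched roughly as below. This is a minimal illustrative sketch, not the paper's implementation: the function and variable names (`srgs_style_losses`, `hr_render`, `sr_reference`, `lambda_tex`), the choice of average pooling as the downsampling operator, and the use of L1 losses are all assumptions.

```python
import torch
import torch.nn.functional as F

def srgs_style_losses(hr_render, lr_gt, sr_reference, scale=4, lambda_tex=0.1):
    """Hypothetical loss sketch for optimizing Gaussians in a high-resolution space.

    hr_render:    image rendered from the Gaussians at HR, shape (1, 3, H*scale, W*scale)
    lr_gt:        observed low-resolution training view, shape (1, 3, H, W)
    sr_reference: output of a pre-trained 2D super-resolution model applied to lr_gt,
                  shape (1, 3, H*scale, W*scale)
    """
    # Sub-pixel constraint (assumed form): downsample the HR rendering and compare it
    # with the real LR view, so each LR pixel constrains a block of HR sub-pixels.
    lr_from_hr = F.avg_pool2d(hr_render, kernel_size=scale)
    loss_subpixel = F.l1_loss(lr_from_hr, lr_gt)

    # Texture guidance (assumed form): pull the HR rendering toward the 2D SR model's
    # prediction so densified primitives pick up plausible high-frequency texture.
    loss_texture = F.l1_loss(hr_render, sr_reference)

    return loss_subpixel + lambda_tex * loss_texture
```

How the two terms are downsampled, weighted, and scheduled during training would follow the paper itself, which the abstract does not specify.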

Abstract (translated)

In recent years, 3D Gaussian Splatting (3DGS) has attracted attention as a novel explicit 3D representation. This approach relies on the representational power of Gaussian primitives to deliver high-quality rendering. However, primitives optimized at low resolution tend to be sparse and lack texture, which makes high-resolution novel view synthesis (HRNVS) challenging. To address this problem, we propose Super-Resolution 3D Gaussian Splatting (SRGS), which performs the optimization in a high-resolution (HR) space. A sub-pixel constraint is introduced for the increased viewpoints in HR space, exploiting the sub-pixel cross-view information of multiple low-resolution (LR) views. The gradients accumulated from more viewpoints promote the densification of the primitives. In addition, a pre-trained 2D super-resolution model is integrated with the sub-pixel constraint, enabling these dense primitives to learn faithful texture features. Overall, our method focuses on densification and texture learning to effectively enhance the representation ability of the primitives. In experiments, our method achieves high rendering quality on HRNVS with only LR inputs and outperforms state-of-the-art methods on challenging datasets such as Mip-NeRF 360 and Tanks & Temples. The related code will be released upon acceptance.

URL

https://arxiv.org/abs/2404.10318

PDF

https://arxiv.org/pdf/2404.10318.pdf

