Paper Reading AI Learner

Autonomous Implicit Indoor Scene Reconstruction with Frontier Exploration

2024-04-16 01:59:03
Jing Zeng, Yanxu Li, Jiahao Sun, Qi Ye, Yunlong Ran, Jiming Chen

Abstract

Implicit neural representations have demonstrated significant promise for 3D scene reconstruction. Recent works have extended their applications to autonomous implicit reconstruction through the Next Best View (NBV) based method. However, the NBV method cannot guarantee complete scene coverage and often necessitates extensive viewpoint sampling, particularly in complex scenes. In the paper, we propose to 1) incorporate frontier-based exploration tasks for global coverage with implicit surface uncertainty-based reconstruction tasks to achieve high-quality reconstruction. and 2) introduce a method to achieve implicit surface uncertainty using color uncertainty, which reduces the time needed for view selection. Further with these two tasks, we propose an adaptive strategy for switching modes in view path planning, to reduce time and maintain superior reconstruction quality. Our method exhibits the highest reconstruction quality among all planning methods and superior planning efficiency in methods involving reconstruction tasks. We deploy our method on a UAV and the results show that our method can plan multi-task views and reconstruct a scene with high quality.

Abstract (translated)

隐式神经表示在3D场景重建方面显示出巨大的潜力。最近的工作将应用扩展到通过基于下一个最好视角(NBV)的方法实现自主隐式重建。然而,NBV方法无法保证完全覆盖场景,通常需要进行广泛的视点采样,尤其是在复杂场景中。在本文中,我们提出了一种方法,将基于前沿的探索任务与基于隐式表面不确定性 based 的重建任务相结合以实现高质量的重建。我们还引入了一种使用颜色不确定性实现隐式表面不确定性的方法,以减少视点选择的时间。更进一步,这两种任务使我们在视路路径规划中采用自适应策略,以减少时间和保持卓越的重建质量。我们的方法在所有规划方法中表现出最高的重建质量,并且在涉及重建任务的方法中具有卓越的规划效率。我们将我们的方法应用于无人机,结果表明,我们的方法可以规划多任务视图并具有高质感的场景重建。

URL

https://arxiv.org/abs/2404.10218

PDF

https://arxiv.org/pdf/2404.10218.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot