Paper Reading AI Learner

SLAM for Indoor Mapping of Wide Area Construction Environments

2024-04-26 07:42:20
Vincent Ress, Wei Zhang, David Skuddis, Norbert Haala, Uwe Soergel

Abstract

Simultaneous localization and mapping (SLAM), i.e., the reconstruction of the environment represented by a (3D) map together with concurrent pose estimation, has made astonishing progress. Meanwhile, large-scale applications aiming at data collection in complex environments like factory halls or construction sites are becoming feasible. However, in contrast to small-scale scenarios with building interiors separated into single rooms, shop floors or construction areas require measurements at larger distances in potentially textureless areas under difficult illumination. Pose estimation is further aggravated since, as is usual for such indoor applications, no GNSS measurements are available. In our work, we realize data collection in a large factory hall with a robot system equipped with four stereo cameras as well as a 3D laser scanner. We apply our state-of-the-art LiDAR and visual SLAM approaches and discuss the respective pros and cons of the different sensor types for trajectory estimation and dense map generation in such an environment. Additionally, dense and accurate depth maps are generated by 3D Gaussian splatting, which we plan to use in the context of our project aiming at automatic construction and site monitoring.
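
A note on the last point: depth maps from 3D Gaussian splatting are typically read out by alpha-compositing the per-splat depths along each camera ray, front to back, using the same weights that blend the splats' colors. The short NumPy sketch below illustrates this weighting for a single pixel; the function name composite_depth and the example values are illustrative assumptions, not code from the paper.

import numpy as np

def composite_depth(depths, alphas):
    """Composite one pixel's depth from splats sorted front to back.

    depths : (N,) per-splat depths along the camera ray
    alphas : (N,) per-splat opacities after Gaussian falloff, in [0, 1]
    Returns the alpha-blended (expected) depth under the compositing weights.
    """
    depths = np.asarray(depths, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Transmittance in front of each splat: T_i = prod_{j<i} (1 - alpha_j)
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = alphas * transmittance          # contribution of each splat
    total = weights.sum()
    return (weights * depths).sum() / max(total, 1e-8)

# Example: three splats along one ray, the nearest one fairly opaque.
print(composite_depth(depths=[2.1, 2.4, 5.0], alphas=[0.6, 0.3, 0.8]))

Because the weights are shared with the color rendering, a depth map obtained this way is geometrically consistent with the rendered image, which is what makes it attractive for dense mapping.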

URL

https://arxiv.org/abs/2404.17215

PDF

https://arxiv.org/pdf/2404.17215.pdf

