Mesh-based Photorealistic and Real-time 3D Mapping for Robust Visual Perception of Autonomous Underwater Vehicle

2024-04-29 03:15:21
Jungwoo Lee, Younggun Cho

Abstract

This paper proposes a photorealistic, real-time, dense 3D mapping system that combines a learning-based image enhancement method with a mesh-based map representation. Underwater environments suffer from haze and low contrast, which makes conventional simultaneous localization and mapping (SLAM) methods hard to apply. Moreover, photorealistic mapping is very important for sensitive tasks such as crack inspection. However, an Autonomous Underwater Vehicle (AUV) has limited onboard computation. In this paper, we use a neural network-based image enhancement method to improve pose estimation and mapping quality, and apply a sliding window-based mesh expansion method to enable lightweight, fast, and photorealistic mapping. To validate our results, we use real-world and indoor synthetic datasets: we perform qualitative validation on the real-world dataset and quantitative validation by modeling images from the indoor synthetic dataset as underwater scenes.
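
The abstract does not say how indoor synthetic images are modeled as underwater scenes. As a rough illustration only, the sketch below uses a widely used simplified image formation model, I = J * t + B * (1 - t) with per-channel transmission t = exp(-beta * d), where J is the clean color, d is the depth in meters, beta is an attenuation coefficient, and B is the backscatter (veiling) color. The function name `simulate_underwater` and the default values of `beta` and `backscatter` are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def simulate_underwater(rgb, depth_m,
                        beta=(0.40, 0.12, 0.08),
                        backscatter=(0.05, 0.35, 0.45)):
    """Degrade a clean RGB image with a simplified underwater model.

    rgb:      rows of (r, g, b) tuples with values in [0, 1]
    depth_m:  rows of per-pixel depths in meters (same layout as rgb)
    beta:     hypothetical per-channel attenuation coefficients (1/m);
              red is attenuated fastest, as is typical in water
    backscatter: hypothetical veiling color reached at large depth
    """
    out = []
    for row_rgb, row_depth in zip(rgb, depth_m):
        out_row = []
        for pixel, d in zip(row_rgb, row_depth):
            degraded = []
            for j, b_c, veil in zip(pixel, beta, backscatter):
                t = math.exp(-b_c * d)          # transmission along the ray
                degraded.append(j * t + veil * (1.0 - t))
            out_row.append(tuple(degraded))
        out.append(out_row)
    return out


if __name__ == "__main__":
    clean = [[(1.0, 1.0, 1.0), (0.2, 0.4, 0.6)]]
    depth = [[5.0, 5.0]]
    print(simulate_underwater(clean, depth))
```

With a rendered RGB-D frame as input, a degraded image of this kind can be fed to the enhancement and mapping stages while the clean frame serves as ground truth for quantitative comparison.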

URL

https://arxiv.org/abs/2404.18395

PDF

https://arxiv.org/pdf/2404.18395.pdf

