Real-time Monocular Visual Odometry for Turbid and Dynamic Underwater Environments

2018-07-03 08:46:56
Maxime Ferrera (1, 2), Julien Moras (1), Pauline Trouvé-Peloux (1), Vincent Creuze (2) ((1) DTIS, ONERA, Université Paris Saclay, (2) LIRMM)

Abstract

In the context of robotic underwater operations, the visual degradations induced by the properties of the medium make it difficult to rely exclusively on cameras for localization. Hence, most localization methods are based on expensive navigational sensors combined with acoustic positioning. On the other hand, visual odometry and visual SLAM have been studied exhaustively for aerial and terrestrial applications, but state-of-the-art algorithms fail underwater. In this paper we tackle the problem of using a simple low-cost camera for underwater localization and propose a new monocular visual odometry method dedicated to the underwater environment. We evaluate different tracking methods and show that optical-flow-based tracking is better suited to underwater images than classical approaches based on descriptor matching. We also propose a keyframe-based visual odometry approach that relies heavily on nonlinear optimization. The proposed algorithm has been assessed on both simulated and real underwater datasets and outperforms state-of-the-art visual SLAM methods under many of the most challenging conditions. The main application of this work is the localization of Remotely Operated Vehicles (ROVs) used for underwater archaeological missions, but the developed system can be used in any other application as long as visual information is available.
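To make the favored front-end concrete, below is a minimal sketch, in Python with OpenCV, of optical-flow-based tracking with two-view pose recovery. This is the class of method the abstract argues suits turbid underwater imagery, not the authors' implementation; the intrinsics matrix K and the function name track_and_estimate_pose are illustrative placeholders.

    import cv2
    import numpy as np

    # Hypothetical pinhole intrinsics; replace with your calibrated camera matrix.
    K = np.array([[700.0,   0.0, 320.0],
                  [  0.0, 700.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    def track_and_estimate_pose(prev_gray, curr_gray):
        """Track Shi-Tomasi corners from prev_gray into curr_gray with
        pyramidal Lucas-Kanade optical flow, then recover the relative
        camera motion (up to scale, as in any monocular VO) from the
        essential matrix."""
        # Detect corners in the previous frame; no descriptors are computed.
        pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                           qualityLevel=0.01, minDistance=8)
        if pts_prev is None:
            return None

        # KLT optical flow follows local image gradients, which survive the
        # blur and low contrast of turbid water better than descriptor
        # matching between independently detected keypoints.
        pts_curr, status, _err = cv2.calcOpticalFlowPyrLK(
            prev_gray, curr_gray, pts_prev, None,
            winSize=(21, 21), maxLevel=3)
        ok = status.ravel() == 1
        p0 = pts_prev[ok].reshape(-1, 2)
        p1 = pts_curr[ok].reshape(-1, 2)
        if len(p0) < 8:
            return None

        # RANSAC on the essential matrix rejects tracks on dynamic content
        # (fish, floating particles) as outliers.
        E, inlier_mask = cv2.findEssentialMat(p0, p1, K, method=cv2.RANSAC,
                                              prob=0.999, threshold=1.0)
        if E is None or E.shape != (3, 3):
            return None
        _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inlier_mask)
        return R, t  # rotation matrix and unit-norm translation direction

In a complete pipeline of the kind the abstract describes, such frame-to-frame estimates would feed a keyframe-based back-end that refines poses and triangulated landmarks by nonlinear minimization of reprojection error (for example with a solver such as Ceres or g2o), which is the nonlinear optimization the abstract refers to.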

URL

https://arxiv.org/abs/1806.05842

PDF

https://arxiv.org/pdf/1806.05842.pdf

