A Variational Observation Model of 3D Object for Probabilistic Semantic SLAM

2018-09-14 02:27:58
H. W. Yu, B. H. Le

Abstract

We present a Bayesian object observation model for complete probabilistic semantic SLAM. Recent studies on object detection and feature extraction have become important for scene understanding and 3D mapping. However, the 3D shape of an object is too complex to formulate as a probabilistic observation model; as a result, Bayesian inference over object-oriented features and their poses has received little attention. Moreover, when a robot equipped with a monocular RGB camera observes only a single projected view of an object, a significant amount of the 3D shape information is lost. Because of these limitations, semantic SLAM and viewpoint-independent loop closure using volumetric 3D object shape remain challenging. To enable a complete formulation of probabilistic semantic SLAM, we approximate the observation model of a 3D object with a tractable distribution. We also estimate the variational likelihood from the 2D image of the object to exploit its observed single view. To evaluate the proposed method, we perform pose and feature estimation and demonstrate that automatic loop closure works seamlessly, without an additional loop detector, in various environments.

URL

https://arxiv.org/abs/1809.05225

PDF

https://arxiv.org/pdf/1809.05225.pdf

