Recognizing and Tracking High-Level, Human-Meaningful Navigation Features of Occupancy Grid Maps

2019-03-08 21:07:10
Payam Nikdel, Richard Vaughan

Abstract

This paper describes a system whereby a robot detects and tracks human-meaningful navigational cues as it navigates in an indoor environment. It is intended as the sensor front-end for a mobile robot system that can communicate its navigational context to human users. From simulated LiDAR scan data we construct a set of 2D occupancy grid bitmaps, then hand-label these with human-scale navigational features such as closed doors, open corridors and intersections. We train a Convolutional Neural Network (CNN) to recognize these features in input bitmaps. In our demonstration system, these features are detected at every time step, then passed to a tracking module that performs frame-to-frame data association to improve detection accuracy and identify stable, unique features. We evaluate the system in both simulation and the real world. We compare performance across three kinds of input occupancy grid: grids obtained directly from LiDAR data, grids constructed incrementally with SLAM, and a combination of the two.
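
The abstract does not say how the tracking module associates detections between frames. As an illustration only, the sketch below uses greedy nearest-neighbor matching with a distance gate, and reports a feature as stable once it has been matched in several frames. The names `Track` and `FeatureTracker`, the parameters `gate`, `min_hits` and `max_misses`, and the assumption that detections arrive as `(label, x, y)` tuples in a fixed map frame are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from itertools import count


@dataclass
class Track:
    track_id: int
    label: str       # e.g. "door", "corridor", "intersection"
    x: float         # running-mean position in the map frame (assumed)
    y: float
    hits: int = 1    # number of frames in which this track was matched
    misses: int = 0  # consecutive frames without a match


class FeatureTracker:
    """Hypothetical frame-to-frame tracker; not the authors' implementation."""

    def __init__(self, gate=0.5, min_hits=3, max_misses=2):
        self.gate = gate              # max match distance (assumed metres)
        self.min_hits = min_hits      # matches needed before a track is stable
        self.max_misses = max_misses  # consecutive misses tolerated
        self.tracks = []
        self._ids = count()

    def update(self, detections):
        """Associate one frame of (label, x, y) detections; return stable tracks."""
        unmatched = list(detections)
        for track in self.tracks:
            best, best_d = None, self.gate
            for det in unmatched:
                label, x, y = det
                if label != track.label:
                    continue
                d = math.hypot(x - track.x, y - track.y)
                if d <= best_d:
                    best, best_d = det, d
            if best is None:
                track.misses += 1
                continue
            unmatched.remove(best)
            _, x, y = best
            track.hits += 1
            track.misses = 0
            # Incremental running mean of the matched positions.
            track.x += (x - track.x) / track.hits
            track.y += (y - track.y) / track.hits
        # Drop tracks that have gone unmatched for too long.
        self.tracks = [t for t in self.tracks if t.misses <= self.max_misses]
        # Every detection left over starts a new tentative track.
        for label, x, y in unmatched:
            self.tracks.append(Track(next(self._ids), label, x, y))
        return [t for t in self.tracks if t.hits >= self.min_hits]


tracker = FeatureTracker()
frames = [
    [("door", 1.0, 2.0)],
    [("door", 1.1, 2.0), ("corridor", 5.0, 5.0)],
    [("door", 0.9, 2.0)],
]
stable = []
for frame in frames:
    stable = tracker.update(frame)
```

With these three frames, the door detection is matched in every frame and becomes stable on the third, while the single corridor detection stays tentative. Greedy matching is the simplest choice; a real system might prefer a globally optimal assignment when several features of the same class lie close together.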

URL

https://arxiv.org/abs/1903.03669

PDF

https://arxiv.org/pdf/1903.03669.pdf