Flying through a narrow gap using neural network: an end-to-end planning and control approach

2019-03-21 16:19:05
Jiarong Lin, Luqi Wang, Fei Gao, Shaojie Shen, Fu Zhang

Abstract

In this paper, we investigate the problem of enabling a drone to fly through a tilted narrow gap without a traditional planning and control pipeline. To this end, we propose an end-to-end policy network that imitates the traditional pipeline and is fine-tuned using reinforcement learning. Unlike previous works, which plan dynamically feasible trajectories using motion primitives and track the generated trajectory with a geometric controller, our proposed method is an end-to-end approach that takes the flight scenario as input and directly outputs thrust-attitude control commands for the quadrotor. The key contributions of our paper are: 1) presenting an imitate-reinforce training framework; 2) flying through a narrow gap using an end-to-end policy network, showing that a learning-based method can address the highly dynamic control problem as the traditional pipeline does (see attached video: https://www.youtube.com/watch?v=jU1qRcLdjx0); 3) proposing a robust imitation of an optimal trajectory generator using multilayer perceptrons; 4) showing how reinforcement learning can improve the performance of imitation learning, and its potential to achieve higher performance than the model-based method.
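
The imitation stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' network or training code: the two-dimensional state, the PD-style `expert_command` standing in for the model-based pipeline, the `TinyMLP` class, and every hyperparameter are assumptions made up for illustration. It only shows the general idea of fitting a multilayer perceptron to commands produced by an existing controller; the reinforcement-learning fine-tuning stage is not covered.

```python
import math
import random


def expert_command(pos_err, vel_err, kp=2.0, kd=0.5):
    """Hypothetical stand-in for the traditional pipeline: a simple PD law."""
    return kp * pos_err + kd * vel_err


class TinyMLP:
    """One-hidden-layer perceptron (tanh hidden units, linear output)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden activations, then a linear read-out.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return y, h

    def sgd_step(self, x, target, lr):
        # One stochastic gradient step on the loss 0.5 * (y - target)^2.
        y, h = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the pre-update output weight.
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_pre * xi
            self.b1[j] -= lr * grad_pre
        self.b2 -= lr * err


def mean_loss(net, data):
    return sum(0.5 * (net.forward(x)[0] - t) ** 2 for x, t in data) / len(data)


def train_imitation(epochs=100, lr=0.02, seed=0):
    rng = random.Random(seed)
    # Sample states and label each one with the expert's command.
    states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
    data = [(s, expert_command(*s)) for s in states]
    net = TinyMLP(2, 8, rng)
    loss_before = mean_loss(net, data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, t in data:
            net.sgd_step(x, t, lr)
    return net, loss_before, mean_loss(net, data)


net, loss_before, loss_after = train_imitation()
print(f"imitation loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In this toy setting the learned policy is only as good as the expert it copies, which is the limitation the abstract's reinforcement-learning fine-tuning is meant to address.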

URL

https://arxiv.org/abs/1903.09088

PDF

https://arxiv.org/pdf/1903.09088.pdf

