Autonomous Air Traffic Controller: A Deep Multi-Agent Reinforcement Learning Approach

2019-05-02 21:03:27
Marc Brittain, Peng Wei

Abstract

Air traffic control is a real-time, safety-critical decision-making process in highly dynamic and stochastic environments. In today's aviation practice, a human air traffic controller monitors and directs many aircraft flying through a designated airspace sector. With the fast-growing complexity of air traffic in traditional (commercial airliners) and low-altitude (drones and eVTOL aircraft) airspace, an autonomous air traffic control system is needed to accommodate high-density air traffic and ensure safe separation between aircraft. We propose a deep multi-agent reinforcement learning framework that is able to identify and resolve conflicts between aircraft in a high-density, stochastic, and dynamic en-route sector with multiple intersections and merging points. The proposed framework uses an actor-critic model, A2C, that incorporates the loss function from Proximal Policy Optimization (PPO) to help stabilize the learning process. In addition, we use a centralized learning, decentralized execution scheme in which one neural network is learned and shared by all agents in the environment. We show that our framework is both scalable and efficient for a large number of incoming aircraft, achieving extremely high traffic throughput with a safety guarantee. We evaluate our model via extensive simulations in the BlueSky environment. Results show that our framework is able to resolve 99.97% of all conflicts at intersections and 100% of all conflicts at merging points in extremely high-density air traffic scenarios.
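
The abstract names the PPO loss and a single network shared by all agents but gives no formulas. As a rough illustration only, the standard PPO clipped surrogate loss can be sketched as below. The function name, the argument names, and the clip range of 0.2 are assumptions for this sketch and are not taken from the paper; pooling samples from several aircraft into one batch is one simple way to reflect a shared policy.

```python
import math

def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a value to minimize).

    Hypothetical helper: clip_eps=0.2 is an assumed value, not one
    reported in the abstract.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because the surrogate objective is maximized.
    return -total / len(advantages)

# With one shared network, samples gathered by different aircraft can be
# pooled into a single batch before computing the loss (toy numbers).
aircraft_a = {"new": [-0.9], "old": [-1.0], "adv": [0.5]}
aircraft_b = {"new": [-1.2], "old": [-1.0], "adv": [-0.3]}
loss = ppo_clipped_loss(
    aircraft_a["new"] + aircraft_b["new"],
    aircraft_a["old"] + aircraft_b["old"],
    aircraft_a["adv"] + aircraft_b["adv"],
)
```

The clipping limits how far a single update can move the policy away from the one that collected the data, which is the stabilizing effect the abstract refers to.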

URL

https://arxiv.org/abs/1905.01303

PDF

https://arxiv.org/pdf/1905.01303.pdf

