Fully neuromorphic vision and control for autonomous drone flight

2023-03-15 17:19:45
Federico Paredes-Vallés, Jesse Hagenaars, Julien Dupeyroux, Stein Stroobants, Yingfu Xu, Guido de Croon

Abstract

Biological sensing and processing is asynchronous and sparse, leading to low-latency and energy-efficient perception and action. In robotics, neuromorphic hardware for event-based vision and spiking neural networks promises to exhibit similar characteristics. However, robotic implementations have been limited to basic tasks with low-dimensional sensory inputs and motor actions due to the restricted network size in current embedded neuromorphic processors and the difficulties of training spiking neural networks. Here, we present the first fully neuromorphic vision-to-control pipeline for controlling a freely flying drone. Specifically, we train a spiking neural network that accepts high-dimensional raw event-based camera data and outputs low-level control actions for performing autonomous vision-based flight. The vision part of the network, consisting of five layers and 28.8k neurons, maps incoming raw events to ego-motion estimates and is trained with self-supervised learning on real event data. The control part consists of a single decoding layer and is learned with an evolutionary algorithm in a drone simulator. Robotic experiments show a successful sim-to-real transfer of the fully learned neuromorphic pipeline. The drone can accurately follow different ego-motion setpoints, allowing for hovering, landing, and maneuvering sideways, even while yawing at the same time. The neuromorphic pipeline runs on board on Intel's Loihi neuromorphic processor with an execution frequency of 200 Hz, spending only 27 µJ per inference. These results illustrate the potential of neuromorphic sensing and processing for enabling smaller, more intelligent robots.
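
To put the last two figures in perspective, the short sketch below multiplies the quoted energy per inference by the quoted execution frequency to get a rough average inference power, and derives the time budget per inference. It assumes that one inference runs per cycle at 200 Hz and that 27 µJ is the full cost of each inference; the abstract does not say whether idle or static power is included, so the result is an order-of-magnitude estimate only. The function and constant names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope estimate from the two figures quoted in the abstract.
# Assumption: one inference per cycle, and 27 µJ is the whole per-inference cost.

ENERGY_PER_INFERENCE_J = 27e-6  # 27 µJ per inference (from the abstract)
EXECUTION_RATE_HZ = 200.0       # 200 Hz execution frequency (from the abstract)


def average_inference_power_w(energy_j: float, rate_hz: float) -> float:
    """Average power in watts: energy per inference times inferences per second."""
    return energy_j * rate_hz


def inference_period_s(rate_hz: float) -> float:
    """Time budget per inference in seconds at the given execution rate."""
    return 1.0 / rate_hz


power_w = average_inference_power_w(ENERGY_PER_INFERENCE_J, EXECUTION_RATE_HZ)
period_s = inference_period_s(EXECUTION_RATE_HZ)

print(f"Average inference power: {power_w * 1e3:.1f} mW")
print(f"Time budget per inference: {period_s * 1e3:.1f} ms")
```

Under these assumptions the network would draw about 5.4 mW on average for inference, with a 5 ms budget per control step.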

URL

https://arxiv.org/abs/2303.08778

PDF

https://arxiv.org/pdf/2303.08778.pdf

