Bio-inspired Autonomous Exploration Policies with CNN-based Object Detection on Nano-drones

2023-01-28 12:15:01
Lorenzo Lamberti, Luca Bompani, Victor Javier Kartsch, Manuele Rusci, Daniele Palossi, Luca Benini

Abstract

Nano-sized drones, with a palm-sized form factor, are gaining relevance in the Internet-of-Things ecosystem. Achieving a high degree of autonomy for complex multi-objective missions (e.g., safe flight, exploration, object detection) is extremely challenging for the onboard chipset due to tight size, payload (<10 g), and power envelope constraints, which strictly limit both memory and computation. Our work addresses this problem by combining bio-inspired navigation policies, which rely on time-of-flight distance sensor data, with a vision-based convolutional neural network (CNN) for object detection. Our field-proven nano-drone is equipped with two microcontroller units (MCUs): a single-core ARM Cortex-M4 (STM32) for safe navigation and exploration policies, and a parallel ultra-low-power octa-core RISC-V (GAP8) for onboard CNN inference, with a power envelope of just 134 mW, including image sensors and external memories. The object detection task achieves a mean average precision of 50% (at 1.6 frames/s) on a dataset collected in the field. We compare four bio-inspired exploration policies and find that a pseudo-random policy achieves the highest coverage, 83% of a ~36 m^2 unknown room in a 3-minute flight. By combining the detection CNN and the exploration policy, we show an average detection rate of 90% on six target objects in a never-seen-before environment.
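
To make the idea of a pseudo-random exploration policy more concrete, here is a minimal Python sketch of one plausible variant: fly straight until a front-facing range reading drops below a threshold, then pick a new random heading. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not define the policy. The function names (`front_distance`, `simulate_pseudo_random`), the square 6 m x 6 m room, the 0.5 m grid cells, the speed, and the stop distance are all hypothetical choices, and the resulting coverage figure is not expected to match the 83% reported in the paper.

```python
import math
import random


def front_distance(x, y, heading, room_m):
    """Distance from (x, y) to the nearest wall along `heading`.

    Stands in for a front time-of-flight reading in an empty square room
    spanning [0, room_m] x [0, room_m] (an assumption of this sketch).
    """
    dx, dy = math.cos(heading), math.sin(heading)
    dists = []
    if dx > 1e-9:
        dists.append((room_m - x) / dx)
    elif dx < -1e-9:
        dists.append(-x / dx)
    if dy > 1e-9:
        dists.append((room_m - y) / dy)
    elif dy < -1e-9:
        dists.append(-y / dy)
    return min(dists)


def simulate_pseudo_random(room_m=6.0, cell_m=0.5, speed_mps=0.5,
                           stop_dist_m=0.5, flight_s=180.0, dt=0.1, seed=0):
    """Toy 2D simulation of a fly-straight / random-turn policy.

    Returns the fraction of grid cells visited at least once.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = int(round(room_m / cell_m))          # grid cells per side
    visited = set()
    x = y = room_m / 2.0                     # start in the room centre
    heading = rng.uniform(0.0, 2.0 * math.pi)

    for _ in range(int(flight_s / dt)):
        # Mark the grid cell under the drone as covered.
        visited.add((min(int(x / cell_m), n - 1),
                     min(int(y / cell_m), n - 1)))
        if front_distance(x, y, heading, room_m) < stop_dist_m:
            # Obstacle ahead: stop and draw a new random heading.
            heading = rng.uniform(0.0, 2.0 * math.pi)
        else:
            # Path is clear: advance one time step along the heading.
            x += speed_mps * dt * math.cos(heading)
            y += speed_mps * dt * math.sin(heading)

    return len(visited) / (n * n)


if __name__ == "__main__":
    coverage = simulate_pseudo_random(seed=0)
    print(f"covered fraction of grid cells: {coverage:.2f}")
```

Coverage here is counted as visited grid cells over total cells, which is only one possible definition. A real drone would also have to handle sensor noise, obstacles inside the room, and drift, none of which this sketch models.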

URL

https://arxiv.org/abs/2301.12175

PDF

https://arxiv.org/pdf/2301.12175.pdf

