Autonomous Navigation at the Nano-Scale: Algorithms, Architectures, and Constraints

2026-01-19 17:38:15
Mahmud S. Zango, Jianglin Lan

Abstract

Autonomous navigation for nano-scale unmanned aerial vehicles (nano-UAVs) is governed by extreme Size, Weight, and Power (SWaP) constraints (a vehicle weight below 50 g and an onboard processor drawing less than 100 mW), which distinguish it fundamentally from standard robotic paradigms. This review synthesizes the state of the art in sensing, computing, and control architectures designed specifically for these sub-100 mW computational envelopes. We critically analyse the transition from classical geometry-based methods to emerging "Edge AI" paradigms, including quantized deep neural networks deployed on ultra-low-power Systems-on-Chip (SoCs) and neuromorphic event-based control. Beyond algorithms, we evaluate the hardware-software co-design required for autonomy, covering advances in dense optical flow, optimized Simultaneous Localization and Mapping (SLAM), and learning-based flight control. While significant progress has been made in visual navigation and relative pose estimation, our analysis reveals persistent gaps in long-term endurance, robust obstacle avoidance in dynamic environments, and the "Sim-to-Real" transfer of reinforcement learning policies. This survey provides a roadmap for bridging these gaps, advocating hybrid architectures that fuse lightweight classical control with data-driven perception to enable fully autonomous, agile nano-UAVs in GPS-denied environments.

URL

https://arxiv.org/abs/2601.13252

PDF

https://arxiv.org/pdf/2601.13252.pdf

