A Compendium of Autonomous Navigation using Object Detection and Tracking in Unmanned Aerial Vehicles

2025-05-31 09:13:43
Mohit Arora, Pratyush Shukla, Shivali Chopra

Abstract

Unmanned Aerial Vehicles (UAVs) are among the most revolutionary inventions of the 21st century. At the core of a UAV lies a central processing system that uses wireless signals to control the vehicle's movement. The most popular UAVs are quadcopters, which use a set of four motors, arranged as two on either side with opposite spin. An autonomous UAV is called a drone. Drones have been in service in the US army since the 1990s for covert missions critical to national security. It would not be wrong to claim that drones form an integral part of national security and provide the most valuable service during surveillance operations. While UAVs are controlled using wireless signals, several challenges disrupt their operation, such as signal quality and range, real-time processing, human expertise, robust hardware, and data security. These challenges can be addressed by programming UAVs to be autonomous, using object detection and tracking through Computer Vision algorithms. Computer Vision is an interdisciplinary field that seeks to use deep learning to gain a high-level understanding of digital images and videos, with the aim of automating tasks performed by the human visual system. Using computer vision, algorithms for detecting and tracking various objects can be developed to suit the available hardware, allowing real-time processing for immediate judgement. This paper reviews the approaches that several authors have proposed for autonomous navigation of UAVs through real-time object detection and tracking algorithms, for applications in fields such as disaster management, dense area exploration, and traffic vehicle surveillance.

URL

https://arxiv.org/abs/2506.05378

PDF

https://arxiv.org/pdf/2506.05378.pdf

