UAVDB: Trajectory-Guided Adaptable Bounding Boxes for UAV Detection

2024-09-09 13:27:53
Yu-Hsi Chen

Abstract

With the rapid development of drone technology, accurate detection of Unmanned Aerial Vehicles (UAVs) has become essential for applications such as surveillance, security, and airspace management. In this paper, we propose a novel trajectory-guided method, the Patch Intensity Convergence (PIC) technique, which generates high-fidelity bounding boxes for UAV detection tasks without the effort of manual labeling. The PIC technique forms the foundation for UAVDB, a database created explicitly for UAV detection. Unlike existing datasets, which often use low-resolution footage or focus on UAVs against simple backgrounds, UAVDB employs high-resolution video to capture UAVs at various scales, ranging from hundreds of pixels down to nearly single-digit pixel sizes. This broad scale variation enables comprehensive evaluation of detection algorithms across different UAV sizes and distances. With the PIC technique, detection datasets can also be generated efficiently from trajectory or positional data, even when no size information is available. We extensively benchmark UAVDB using the YOLOv8 series of detectors and provide a detailed performance analysis. Our findings highlight UAVDB's potential as a vital database for advancing UAV detection, particularly in high-resolution and long-distance tracking scenarios.
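Because the benchmark relies on YOLOv8 detectors, any box derived from trajectory data would eventually have to be stored as a detection label. The sketch below is not taken from the paper and does not implement PIC itself. It only illustrates, under stated assumptions, how a pixel-space box could be written in the plain-text YOLO label convention (class index followed by normalized center x, center y, width, and height). The function name `to_yolo_label`, the single class index 0, and the example frame size are hypothetical.

```python
def to_yolo_label(box, image_size, class_id=0):
    """Convert a pixel-space box to a YOLO-style label line.

    box        -- (x_min, y_min, x_max, y_max) in pixels (assumed input format)
    image_size -- (width, height) of the frame in pixels
    class_id   -- integer class index; 0 is an assumed single "UAV" class
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size

    # YOLO labels store the box center and size, each normalized to [0, 1].
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h

    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 100x50 pixel box in the top-left corner of a hypothetical 200x100 frame.
    print(to_yolo_label((0, 0, 100, 50), (200, 100)))
```

Normalizing by the frame size keeps the labels valid if the frames are later resized, which matters when targets span from hundreds of pixels to only a few.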

URL

https://arxiv.org/abs/2409.06490

PDF

https://arxiv.org/pdf/2409.06490.pdf

