Paper Reading AI Learner

DroNet: Efficient convolutional neural network detector for real-time UAV applications

2018-07-18 06:30:54
Christos Kyrkou, George Plastiras, Stylianos Venieris, Theocharis Theocharides, Christos-Savvas Bouganis

Abstract

Unmanned Aerial Vehicles (drones) are emerging as a promising technology for both environmental and infrastructure monitoring, with broad use in a plethora of applications. Many such applications require the use of computer vision algorithms in order to analyse the information captured from an on-board camera. Such applications include detecting vehicles for emergency response and traffic monitoring. This paper therefore, explores the trade-offs involved in the development of a single-shot object detector based on deep convolutional neural networks (CNNs) that can enable UAVs to perform vehicle detection under a resource constrained environment such as in a UAV. The paper presents a holistic approach for designing such systems; the data collection and training stages, the CNN architecture, and the optimizations necessary to efficiently map such a CNN on a lightweight embedded processing platform suitable for deployment on UAVs. Through the analysis we propose a CNN architecture that is capable of detecting vehicles from aerial UAV images and can operate between 5-18 frames-per-second for a variety of platforms with an overall accuracy of ~95%. Overall, the proposed architecture is suitable for UAV applications, utilizing low-power embedded processors that can be deployed on commercial UAVs.

Abstract (translated)

无人驾驶飞行器(无人机)正在成为环境和基础设施监测的一种有前景的技术,广泛应用于众多应用领域。许多这样的应用程序需要使用计算机视觉算法来分析从机载相机捕获的信息。这些应用包括检测用于紧急响应和交通监控的车辆。因此,本文探讨了基于深度卷积神经网络(CNN)开发基于深度卷积神经网络(CNN)的单发物体探测器所涉及的权衡,该探测器能够使UAV在资源受限环境(例如无人机)中执行车辆检测。本文介绍了设计此类系统的整体方法;数据收集和训练阶段,CNN架构,以及在适合部署在无人机上的轻量级嵌入式处理平台上有效地映射CNN所需的优化。通过分析,我们提出了一种CNN架构,能够从航空无人机图像中检测车辆,并且可以在各种平台上以每秒5-18帧的速度运行,总体精度为~95%。总的来说,所提出的架构适用于无人机应用,利用可以部署在商用无人机上的低功率嵌入式处理器。

URL

https://arxiv.org/abs/1807.06789

PDF

https://arxiv.org/pdf/1807.06789.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot