Paper Reading AI Learner

Fast Region of Interest Proposals on Maritime UAVs

2023-01-27 10:58:10
Benjamin Kiefer, Andreas Zell

Abstract

Unmanned aerial vehicles assist in maritime search and rescue missions by flying over large search areas to autonomously search for objects or people. Reliably detecting objects of interest requires fast models to employ on embedded hardware. Moreover, with increasing distance to the ground station only part of the video data can be transmitted. In this work, we consider the problem of finding meaningful region of interest proposals in a video stream on an embedded GPU. Current object or anomaly detectors are not suitable due to their slow speed, especially on limited hardware and for large image resolutions. Lastly, objects of interest, such as pieces of wreckage, are often not known a priori. Therefore, we propose an end-to-end future frame prediction model running in real-time on embedded GPUs to generate region proposals. We analyze its performance on large-scale maritime data sets and demonstrate its benefits over traditional and modern methods.

Abstract (translated)

无人飞行器协助海上搜救任务,自主地在大型搜索区域中搜索物体或人员。可靠地检测感兴趣的物体需要嵌入硬件使用的快速模型。此外,与地面站的距离增加,只有一部分视频数据可以传输。在这项工作中,我们考虑在嵌入GPU的视频中查找有意义的区域建议的问题。当前的物体或异常检测器不适合,因为它们的速度较慢,特别是在硬件限制和高图像分辨率的情况下。最后,感兴趣的物体,如残骸碎片,通常不是预先知道的。因此,我们提出了一种在嵌入GPU的视频中实时运行的终末帧预测模型,以生成区域建议。我们分析了它在大规模海上数据集上的性能,并证明了它与传统与现代方法的优势。

URL

https://arxiv.org/abs/2301.11650

PDF

https://arxiv.org/pdf/2301.11650.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot