Paper Reading AI Learner

Small Target Detection for Search and Rescue Operations using Distributed Deep Learning and Synthetic Data Generation

2019-04-25 23:10:54
Kyongsik Yun, Luan Nguyen, Tuan Nguyen, Doyoung Kim, Sarah Eldin, Alexander Huyen, Thomas Lu, Edward Chow

Abstract

It is important to find the target as soon as possible for search and rescue operations. Surveillance camera systems and unmanned aerial vehicles (UAVs) are used to support search and rescue. Automatic object detection is important because a person cannot monitor multiple surveillance screens simultaneously for 24 hours. Also, the object is often too small to be recognized by the human eye on the surveillance screen. This study used UAVs around the Port of Houston and fixed surveillance cameras to build an automatic target detection system that supports the US Coast Guard (USCG) to help find targets (e.g., person overboard). We combined image segmentation, enhancement, and convolution neural networks to reduce detection time to detect small targets. We compared the performance between the auto-detection system and the human eye. Our system detected the target within 8 seconds, but the human eye detected the target within 25 seconds. Our systems also used synthetic data generation and data augmentation techniques to improve target detection accuracy. This solution may help the search and rescue operations of the first responders in a timely manner.

Abstract (translated)

在搜救行动中,尽快找到目标是很重要的。监视摄像系统和无人机(UAV)用于支持搜索和救援。自动目标检测很重要,因为一个人不能同时监视多个监视屏幕24小时。而且,这个物体通常太小,无法被监视屏幕上的人眼识别。这项研究使用休斯顿港周围的无人机和固定的监视摄像头来建立一个自动目标探测系统,支持美国海岸警卫队(USCG)帮助寻找目标(如人员落水)。我们将图像分割、增强和卷积神经网络相结合,以减少检测小目标的时间。我们比较了自动检测系统和人眼的性能。我们的系统在8秒内检测到目标,但人眼在25秒内检测到目标。我们的系统还使用合成数据生成和数据增强技术来提高目标检测精度。该方案可以帮助第一反应者及时进行搜救行动。

URL

https://arxiv.org/abs/1904.11619

PDF

https://arxiv.org/pdf/1904.11619.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot