Cascaded Zoom-in Detector for High Resolution Aerial Images

2023-03-15 16:39:21
Akhil Meethal, Eric Granger, Marco Pedersoli

Abstract

Detecting objects in aerial images is challenging because such images typically contain crowded small objects distributed non-uniformly across a high-resolution frame. Density cropping is a widely used method to improve small object detection: regions crowded with small objects are extracted and processed at high resolution. However, this is typically accomplished by adding other learnable components, which complicates training and inference relative to a standard detection process. In this paper, we propose an efficient Cascaded Zoom-in (CZ) detector that re-purposes the detector itself for density-guided training and inference. During training, density crops are located, labeled as a new class, and used to augment the training dataset. During inference, the density crops are first detected along with the base class objects, and then fed into a second stage of inference. This approach is easily integrated into any detector and, like the uniform cropping approach popular in aerial image detection, creates no significant change in the standard detection process. Experimental results on the aerial images of the challenging VisDrone and DOTA datasets verify the benefits of the proposed approach. The proposed CZ detector also provides state-of-the-art results over uniform cropping and other density cropping methods on the VisDrone dataset, increasing the detection mAP of small objects by more than 3 points.
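
The two-stage inference described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `detect` callable, the `DENSITY_CROP` class id, the score threshold, and the use of per-class non-maximum suppression to merge the two stages are all assumptions made for illustration, and the image is a toy list of pixel rows. A real pipeline would also upscale each crop before the second pass and divide the resulting coordinates by that scale factor.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float, int]        # (box, score, class_id)
Image = Sequence[Sequence[float]]         # toy stand-in: rows of pixel values

DENSITY_CROP = 0  # hypothetical class id reserved for the "density crop" class


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets: List[Detection], thr: float = 0.5) -> List[Detection]:
    """Greedy per-class NMS; the merge strategy is an assumption of this sketch."""
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda x: x[1], reverse=True):
        if all(det[2] != k[2] or iou(det[0], k[0]) < thr for k in kept):
            kept.append(det)
    return kept


def cascaded_zoom_in(
    image: Image,
    detect: Callable[[Image], List[Detection]],
    crop_score_thr: float = 0.5,  # hypothetical confidence cut-off for crops
) -> List[Detection]:
    # Stage 1: one pass over the full image yields objects AND density crops.
    first = detect(image)
    objects = [d for d in first if d[2] != DENSITY_CROP]
    crops = [d for d in first if d[2] == DENSITY_CROP and d[1] >= crop_score_thr]

    # Stage 2: re-run the same detector on each predicted density crop.
    for (x1, y1, x2, y2), _, _ in crops:
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        patch = [row[x1:x2] for row in image[y1:y2]]
        for (bx1, by1, bx2, by2), score, cls in detect(patch):
            if cls == DENSITY_CROP:
                continue  # this sketch does not zoom in recursively
            # Shift crop-local coordinates back into the full-image frame.
            objects.append(((bx1 + x1, by1 + y1, bx2 + x1, by2 + y1), score, cls))

    # Merge detections from both stages.
    return nms(objects)
```

Because the same `detect` callable serves both stages, the sketch adds no extra learnable component, which is the property the abstract emphasizes.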

URL

https://arxiv.org/abs/2303.08747

PDF

https://arxiv.org/pdf/2303.08747.pdf

