Abstract
Detecting objects in aerial images is challenging because such images typically contain crowded small objects distributed non-uniformly across a high-resolution frame. Density cropping is a widely used method for improving small object detection: regions crowded with small objects are extracted and processed at high resolution. However, this is typically accomplished by adding other learnable components, which complicates training and inference relative to a standard detection pipeline. In this paper, we propose an efficient Cascaded Zoom-in (CZ) detector that re-purposes the detector itself for density-guided training and inference. During training, density crops are located, labeled as a new class, and used to augment the training dataset. During inference, the density crops are first detected along with the base class objects and then fed to a second stage of inference. This approach is easily integrated into any detector and, like the uniform cropping approach popular in aerial image detection, introduces no significant change to the standard detection process. Experimental results on aerial images from the challenging VisDrone and DOTA datasets verify the benefits of the proposed approach. The proposed CZ detector also achieves state-of-the-art results on the VisDrone dataset, outperforming uniform cropping and other density cropping methods and increasing the detection mAP of small objects by more than 3 points.
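The two-stage inference described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `detect` callable, the `CROP_CLASS` label id, the fixed `zoom` factor, the nearest-neighbour upsampling, and the per-class NMS used to merge the two stages are all assumptions made for illustration. The abstract only states that density crops are detected as an extra class and then passed through a second round of inference.

```python
import numpy as np

CROP_CLASS = -1  # hypothetical label id for the extra "density crop" class


def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression on xyxy boxes; returns kept indices."""
    order = np.argsort(-scores)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Intersection of the top-scoring box with every remaining box.
        w = np.clip(np.minimum(boxes[i, 2], boxes[rest, 2])
                    - np.maximum(boxes[i, 0], boxes[rest, 0]), 0, None)
        h = np.clip(np.minimum(boxes[i, 3], boxes[rest, 3])
                    - np.maximum(boxes[i, 1], boxes[rest, 1]), 0, None)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_thr]
    return keep


def cascaded_zoom_in(image, detect, zoom=2, iou_thr=0.5):
    """Two-stage inference sketch.

    detect(image) is an assumed interface returning
    (boxes[N, 4] in xyxy pixels, scores[N], labels[N]).
    """
    # Stage 1: one pass yields base-class objects and density crops.
    boxes, scores, labels = detect(image)
    is_crop = labels == CROP_CLASS
    out_b, out_s, out_l = [boxes[~is_crop]], [scores[~is_crop]], [labels[~is_crop]]

    # Stage 2: re-run the same detector on each upscaled density crop.
    for x1, y1, x2, y2 in boxes[is_crop].astype(int):
        crop = image[y1:y2, x1:x2]
        if crop.size == 0:
            continue
        zoomed = crop.repeat(zoom, axis=0).repeat(zoom, axis=1)
        b, s, l = detect(zoomed)
        base = l != CROP_CLASS  # ignore nested crops in this sketch
        # Map crop-space boxes back to full-image coordinates.
        out_b.append(b[base] / zoom + np.array([x1, y1, x1, y1]))
        out_s.append(s[base])
        out_l.append(l[base])

    boxes = np.concatenate(out_b)
    scores = np.concatenate(out_s)
    labels = np.concatenate(out_l)

    # Merge both stages with per-class NMS to drop duplicate detections.
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        keep.extend(idx[nms(boxes[idx], scores[idx], iou_thr)])
    keep = np.array(keep, dtype=int)
    return boxes[keep], scores[keep], labels[keep]
```

Because the same `detect` callable serves both stages, the sketch needs no additional learnable component, which is the property the abstract emphasizes.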
URL
https://arxiv.org/abs/2303.08747