Clustered Object Detection in Aerial Images

Abstract

Detecting objects in aerial images is challenging for at least two reasons: (1) target objects such as pedestrians occupy very few pixels, making them hard to distinguish from the surrounding background; and (2) targets are generally sparsely and nonuniformly distributed, making detection very inefficient. In this paper, we address both issues, inspired by the observation that these targets are often clustered. In particular, we propose a Clustered Detection (ClusDet) network that unifies object clustering and detection in an end-to-end framework. The key components of ClusDet are a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet). Given an input image, CPNet produces (object) cluster regions and ScaleNet estimates object scales for these regions. Then, each scale-normalized cluster region and its features are fed into DetecNet for object detection. Compared with previous solutions, ClusDet has several advantages: (1) it greatly reduces the number of blocks for final object detection and hence achieves high running-time efficiency; (2) the cluster-based scale estimation is more accurate than previously used single-object-based ones and hence effectively improves the detection of small objects; and (3) the final DetecNet is dedicated to clustered regions and implicitly models the prior context information so as to boost detection accuracy. The proposed method is tested on three representative aerial image datasets: VisDrone, UAVDT, and DOTA. In all the experiments, ClusDet achieves promising performance in both efficiency and accuracy in comparison with state-of-the-art detectors.
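
The data flow described in the abstract (cluster proposal, scale estimation, detection on scale-normalized regions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three callables are hypothetical stand-ins for CPNet, ScaleNet, and DetecNet, their signatures are assumptions, and the step that maps detections from a rescaled crop back to full-image coordinates is an assumed bookkeeping step that the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    score: float
    label: str


def to_image_coords(det: Detection, region: Box, scale: float) -> Detection:
    """Map a detection from a rescaled crop back to full-image coordinates.

    Assumes the crop was taken at `region` and resized by `scale`, so crop
    coordinates are divided by `scale` and offset by the region's top-left corner.
    """
    rx1, ry1, _, _ = region
    x1, y1, x2, y2 = det.box
    return Detection(
        box=(rx1 + x1 / scale, ry1 + y1 / scale, rx1 + x2 / scale, ry1 + y2 / scale),
        score=det.score,
        label=det.label,
    )


def clustered_detect(
    image: Any,
    propose_clusters: Callable[[Any], List[Box]],                  # stand-in for CPNet
    estimate_scale: Callable[[Any, Box], float],                   # stand-in for ScaleNet
    detect_in_crop: Callable[[Any, Box, float], List[Detection]],  # stand-in for DetecNet
) -> List[Detection]:
    """Run the hypothetical three-stage flow and collect detections in image coordinates."""
    results: List[Detection] = []
    for region in propose_clusters(image):
        scale = estimate_scale(image, region)
        # The detector sees the cluster region resized by `scale` and reports
        # boxes in the coordinates of that resized crop.
        for det in detect_in_crop(image, region, scale):
            results.append(to_image_coords(det, region, scale))
    return results
```

Because only the proposed cluster regions are passed to the detector, the number of crops processed depends on how many clusters are proposed rather than on a fixed tiling of the image, which is the efficiency argument the abstract makes.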

URL

https://arxiv.org/abs/1904.08008

PDF

https://arxiv.org/pdf/1904.08008.pdf