Abstract
Detecting objects in aerial images poses significant challenges for three reasons: 1) Aerial images are typically very large, with millions or even hundreds of millions of pixels, while computational resources are limited. 2) Small object sizes provide insufficient information for effective detection. 3) Non-uniform object distribution wastes computational resources. To address these issues, we propose YOLC (You Only Look Clusters), an efficient and effective framework built on the anchor-free object detector CenterNet. To overcome the challenges posed by large-scale images and non-uniform object distribution, we introduce a Local Scale Module (LSM) that adaptively searches for cluster regions to zoom in on for accurate detection. Additionally, we modify the regression loss using the Gaussian Wasserstein distance (GWD) to obtain high-quality bounding boxes. Deformable convolution and refinement methods are employed in the detection head to enhance the detection of small objects. We perform extensive experiments on two aerial image datasets, Visdrone2019 and UAVDT, to demonstrate the effectiveness and superiority of our proposed approach.
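The abstract does not spell out how the GWD regression loss is computed, so the following is only a minimal sketch under stated assumptions. It models each axis-aligned box (cx, cy, w, h) as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), for which the squared 2-Wasserstein distance has a simple closed form. The function names, the `tau` parameter, and the log-based normalization into a loss are assumptions for illustration and may differ from what YOLC actually uses.

```python
import math


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between Gaussians fitted to two boxes.

    Each box is (cx, cy, w, h). Assumption: a box maps to a Gaussian with
    mean (cx, cy) and covariance diag(w^2 / 4, h^2 / 4).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Distance between the means (box centers).
    center_term = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    # Frobenius distance between covariance square roots diag(w/2, h/2).
    shape_term = ((wa - wb) ** 2 + (ha - hb) ** 2) / 4.0
    return center_term + shape_term


def gwd_loss(box_a, box_b, tau=1.0):
    """Hypothetical normalized loss: 1 - 1 / (tau + log(1 + d^2)).

    With tau = 1 the loss is 0 for identical boxes and approaches 1
    as the boxes move apart. This normalization is an assumption.
    """
    d2 = gwd_squared(box_a, box_b)
    return 1.0 - 1.0 / (tau + math.log1p(d2))


if __name__ == "__main__":
    pred = (10.0, 10.0, 4.0, 2.0)
    target = (11.0, 10.0, 2.0, 2.0)
    print(gwd_squared(pred, target), gwd_loss(pred, target))
```

Unlike an IoU-based loss, this distance stays informative when two small boxes do not overlap at all, which is one plausible reason a Gaussian distance is attractive for small objects.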
URL
https://arxiv.org/abs/2404.06180