Abstract
High-density object counting in surveillance scenes is challenging, mainly because of the drastic variation in object scales. The prevalence of deep learning has substantially improved object counting accuracy on several benchmark datasets. However, do the global counts really count? Armed with this question, we examine the predicted density map, whose summation over the whole image gives the global count, for a more in-depth analysis. We observe that the object density maps generated by most existing methods usually lack local consistency, i.e., counting errors exist in local regions even though the global count appears to match the ground truth well. To address this problem, we propose a constrained multi-stage Convolutional Neural Network (CNN) that pursues a locally consistent density map from two aspects. Unlike most existing methods, which mainly rely on multi-column architectures of plain CNNs, we exploit a stacking formulation of plain CNNs. Benefiting from the internal multi-stage learning process, the feature map can be repeatedly refined, allowing the density map to approach the ground-truth density distribution. To further refine the density map, we also propose a grid loss function. With finer, local-region-based supervision, the underlying model is constrained to generate locally consistent density values, minimizing training errors with respect to both global and local count accuracy. Experiments on two widely used object counting benchmarks yield overall significant results compared with state-of-the-art methods, demonstrating the effectiveness of our approach.
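The local-consistency issue and the idea of local-region-based supervision can be illustrated with a minimal sketch. The abstract does not give the exact form of the grid loss, so everything below is an assumption made for illustration: the function names `grid_counts` and `grid_loss`, the squared-error terms, the non-overlapping grid partition, and the `local_weight` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np


def grid_counts(density, grid):
    """Sum a 2-D density map over a grid x grid partition of non-overlapping cells.

    Hypothetical helper: assumes the map height and width are divisible by `grid`.
    """
    h, w = density.shape
    if h % grid or w % grid:
        raise ValueError("map size must be divisible by the grid size")
    # Reshape so that axes 1 and 3 index the pixels inside each cell, then sum them.
    return density.reshape(grid, h // grid, grid, w // grid).sum(axis=(1, 3))


def grid_loss(pred, gt, grid=2, local_weight=1.0):
    """Illustrative loss combining a global count error and per-cell count errors.

    This is an assumed formulation, not the loss defined in the paper.
    """
    global_term = (pred.sum() - gt.sum()) ** 2
    local_term = np.mean((grid_counts(pred, grid) - grid_counts(gt, grid)) ** 2)
    return global_term + local_weight * local_term


# Ground truth: one object in the top-left cell, one in the bottom-right cell.
gt = np.zeros((4, 4))
gt[0, 0] = 1.0
gt[3, 3] = 1.0

# Prediction: both objects placed in the top-left cell.
pred = np.zeros((4, 4))
pred[0, 0] = 2.0

# The global counts agree, yet the per-cell counts do not.
global_error = abs(pred.sum() - gt.sum())
loss = grid_loss(pred, gt, grid=2)
```

In this toy case the global count error is zero, so a loss based only on the global count would not penalize the prediction, while the per-cell term does. How the paper weights or structures these terms is not stated in the abstract.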
URL
https://arxiv.org/abs/1904.03373