Paper Reading AI Learner

DensSiam: End-to-End Densely-Siamese Network with Self-Attention Model for Object Tracking

2018-09-07 23:41:02
Mohamed H. Abdelpakey, Mohamed S. Shehata, Mostafa M. Mohamed

Abstract

Convolutional Siamese neural networks have recently been used to track objects using deep features. The Siamese architecture can achieve real-time speed; however, it is still difficult to find a Siamese architecture that maintains generalization capability, high accuracy, and speed while decreasing the number of shared parameters, especially when it is very deep. Furthermore, a conventional Siamese architecture usually processes one local neighborhood at a time, which makes the appearance model local and non-robust to appearance changes. To overcome these two problems, this paper proposes DensSiam, a novel convolutional Siamese architecture that uses the concept of dense layers, connecting each dense layer to all layers in a feed-forward fashion and combining them with a similarity-learning function. DensSiam also includes a self-attention mechanism that forces the network to pay more attention to non-local features during offline training. Extensive experiments are performed on tracking benchmarks: OTB2013 and OTB2015 as the validation set, and VOT2015, VOT2016, and VOT2017 as the test set. The results show that DensSiam achieves superior performance on these benchmarks compared to other current state-of-the-art methods.
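The abstract describes three building blocks: a densely connected shared backbone, a non-local self-attention block, and a similarity-learning head applied to a template and a search region. The sketch below is not the authors' implementation; it is a minimal PyTorch illustration of how such pieces could fit together, and the channel widths, growth rate, stem stride, and input sizes are assumptions made for the example.

```python
# Minimal PyTorch sketch of the ideas in the abstract (not the authors' code):
# a shared densely-connected backbone, a non-local self-attention block, and
# cross-correlation between template and search embeddings as the similarity
# function. Channel widths, growth rate, stem stride and input sizes are
# illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseBlock(nn.Module):
    """Each conv layer receives the concatenation of all preceding feature maps."""
    def __init__(self, in_ch, growth, n_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth, kernel_size=3, padding=1, bias=False)))
            ch += growth
        self.out_channels = ch

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)


class SelfAttention(nn.Module):
    """Non-local self-attention: every spatial position attends to all others."""
    def __init__(self, ch):
        super().__init__()
        self.query = nn.Conv2d(ch, ch // 8, kernel_size=1)
        self.key = nn.Conv2d(ch, ch // 8, kernel_size=1)
        self.value = nn.Conv2d(ch, ch, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # B x HW x C'
        k = self.key(x).flatten(2)                      # B x C' x HW
        attn = F.softmax(q @ k, dim=-1)                 # B x HW x HW
        v = self.value(x).flatten(2)                    # B x C  x HW
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                     # residual connection


class DenseSiamese(nn.Module):
    """Shared backbone applied to template z and search region x; the response
    map is the cross-correlation of their embeddings."""
    def __init__(self):
        super().__init__()
        stem = nn.Conv2d(3, 32, kernel_size=7, stride=4)
        block = DenseBlock(32, growth=32, n_layers=4)
        self.backbone = nn.Sequential(stem, block, SelfAttention(block.out_channels))

    def forward(self, z, x):
        fz, fx = self.backbone(z), self.backbone(x)     # shared weights
        # Template embedding acts as the correlation kernel (batch size 1 assumed).
        return F.conv2d(fx, fz)


# Typical Siamese-tracker crops: 127x127 template, 255x255 search region.
net = DenseSiamese()
response = net(torch.randn(1, 3, 127, 127), torch.randn(1, 3, 255, 255))
print(response.shape)  # torch.Size([1, 1, 33, 33])
```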

URL

https://arxiv.org/abs/1809.02714

PDF

https://arxiv.org/pdf/1809.02714.pdf

