Abstract
Object tracking is divided into single-object tracking (SOT) and multi-object tracking (MOT). MOT aims to maintain the identities of multiple objects across consecutive frames of a video sequence. MOT has progressed rapidly in recent years, but modeling the motion and appearance of objects in complex scenes remains challenging. In this paper, we design a novel direction consistency method for smooth trajectory prediction (STP-DC) that strengthens the modeling of motion information and addresses the lack of robustness of previous methods in complex scenes. Existing methods use pedestrian re-identification (Re-ID) to model appearance; however, the features they extract contain considerable background information, which lacks discriminability in occluded and crowded scenes. We propose a hyper-grain feature embedding network (HG-FEN) to improve appearance modeling and thus generate robust appearance descriptors. We also propose further robustness techniques, including CF-ECM for storing robust appearance information and SK-AS for improving association accuracy. To achieve state-of-the-art performance in MOT, we propose a robust tracker named Rt-track that incorporates these techniques. It achieves 79.5 MOTA, 76.0 IDF1 and 62.1 HOTA on the MOT17 test set. Rt-track also achieves 77.9 MOTA, 78.4 IDF1 and 63.3 HOTA on MOT20, surpassing all published methods.
URL
https://arxiv.org/abs/2303.09668