Person Search in Videos with One Portrait Through Visual and Temporal Links

Abstract

In real-world applications such as law enforcement and video retrieval, one often needs to search for a certain person in long videos with just one portrait. This is much more challenging than the conventional settings for person re-identification, as the search may need to be carried out in environments different from the one where the portrait was taken. In this paper, we aim to tackle this challenge and propose a novel framework that takes into account the identity invariance along a tracklet, thus allowing person identities to be propagated via both visual and temporal links. We also develop a novel scheme called Progressive Propagation via Competitive Consensus, which significantly improves the reliability of the propagation process. To promote the study of person search, we construct a large-scale benchmark, which contains 127K manually annotated tracklets from 192 movies. Experiments show that our approach substantially outperforms mainstream person re-id methods, raising the mAP from 42.16% to 62.27%.
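
The idea of propagating an identity over visual and temporal links can be illustrated with a toy graph. The sketch below is not the paper's Progressive Propagation via Competitive Consensus scheme; it is a plain max-product propagation chosen only for illustration. The function name `propagate_identity`, the node numbering, and the link weights are all hypothetical. Temporal links (instances in the same tracklet) are assumed to have weight 1.0, reflecting identity invariance along a tracklet, while visual links carry an assumed similarity in [0, 1].

```python
def propagate_identity(num_nodes, links, seed):
    """Spread the seed's identity score over weighted links (max-product rule).

    Illustrative only: this is NOT the paper's PPCC scheme.
    links: iterable of (node_a, node_b, weight) with weight in [0, 1].
    """
    # Build a symmetric adjacency list.
    neighbors = {i: [] for i in range(num_nodes)}
    for a, b, w in links:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    scores = [0.0] * num_nodes
    scores[seed] = 1.0  # the portrait's identity is known

    # Each node keeps the best (link weight * neighbor score) seen so far.
    # With weights in [0, 1], num_nodes - 1 rounds are enough to converge.
    for _ in range(num_nodes - 1):
        for i in range(num_nodes):
            if i == seed:
                continue
            for j, w in neighbors[i]:
                scores[i] = max(scores[i], w * scores[j])
    return scores


# Hypothetical example: node 0 is the portrait.
# Nodes 1 and 2 belong to the same tracklet (temporal link, weight 1.0).
# Node 1 looks like the portrait; node 2 has no visual link to it at all.
# Node 3 is an unrelated instance with only a weak visual link.
links = [
    (0, 1, 0.9),  # visual link: portrait and a frontal view
    (1, 2, 1.0),  # temporal link: same tracklet
    (0, 3, 0.1),  # visual link: weak match
]
scores = propagate_identity(4, links, seed=0)
```

In this toy setup, node 2 receives a high score through its tracklet neighbor even though it has no direct visual link to the portrait, which is the intuition behind combining the two kinds of links.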

URL

https://arxiv.org/abs/1807.10510

PDF

https://arxiv.org/pdf/1807.10510.pdf