Abstract
In the conventional person re-identification (re-id) setting, the labeled training images are assumed to be person images cropped to a bounding box for each individual. Producing such labels across multiple non-overlapping camera views from raw surveillance video is costly and time-consuming. To overcome this difficulty, we consider weakly supervised person re-id modeling. In the weak setting, a target person is matched against an untrimmed gallery video, and during training we know only that an identity appears in a video, without annotating that identity in any frame. Hence a single video can carry multiple video-level labels, one for each identity that appears in it. We cast this weakly supervised person re-id challenge as a multi-instance multi-label learning (MIML) problem. In particular, we develop a Cross-View MIML (CV-MIML) method that explores potential intraclass person images from all camera views by incorporating intra-bag alignment and cross-view bag alignment. Finally, the CV-MIML method is embedded into an existing deep neural network to form the Deep Cross-View MIML (Deep CV-MIML) model. Extensive experiments on four weakly labeled datasets show the feasibility of the proposed weakly supervised setting and verify the effectiveness of our method compared with related methods.
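To make the weak setting concrete, the following minimal sketch shows how one untrimmed video might be represented as a MIML bag and scored against its video-level labels. It is an illustration under stated assumptions, not the CV-MIML method itself: the `Bag` class, the per-instance score lists, and the max-pooling aggregation are all hypothetical choices (max pooling is a common multi-instance convention), and the intra-bag and cross-view bag alignment terms named in the abstract are not modeled.

```python
import math
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Bag:
    """One untrimmed video: unlabeled person detections plus video-level labels."""
    camera_id: int
    # Hypothetical per-detection identity scores in (0, 1), one list per detection.
    instance_scores: List[List[float]]
    # Identities known to appear somewhere in the video (no frame-level annotation).
    labels: Set[int]


def bag_scores(bag: Bag) -> List[float]:
    """Assumed aggregation: a bag's score for an identity is its best instance score."""
    num_identities = len(bag.instance_scores[0])
    return [
        max(scores[c] for scores in bag.instance_scores)
        for c in range(num_identities)
    ]


def bag_loss(bag: Bag, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between pooled scores and the video-level label set."""
    pooled = bag_scores(bag)
    total = 0.0
    for identity, score in enumerate(pooled):
        score = min(max(score, eps), 1.0 - eps)  # clamp to keep log() finite
        if identity in bag.labels:
            total += -math.log(score)
        else:
            total += -math.log(1.0 - score)
    return total / len(pooled)


# Two detections, three candidate identities; identities 0 and 1 appear in the video.
video = Bag(
    camera_id=1,
    instance_scores=[[0.9, 0.1, 0.2], [0.2, 0.8, 0.1]],
    labels={0, 1},
)
print(bag_scores(video))
print(bag_loss(video))
```

The loss here depends only on which identities appear in the video, never on which detection belongs to whom, which is the property that removes the need for per-frame annotation.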
URL
https://arxiv.org/abs/1904.03832