Abstract
Person re-identification (re-ID) is the task of matching pedestrians across disjoint camera views. To recognise paired snapshots, a re-ID model must cope with the large cross-view variations caused by the camera view shift. Supervised deep neural networks are effective at producing a set of non-linear projections that transform cross-view images into a common feature space. However, they typically impose a symmetric architecture, which renders the network ill-conditioned in its optimisation. In this paper, we learn a view-invariant subspace for person re-ID, together with its corresponding similarity metric, using an adversarial view adaptation approach. The main contribution is to learn coupled asymmetric mappings with respect to view characteristics, which are adversarially trained to address the view discrepancy by optimising a cross-entropy view-confusion objective. To determine the similarity value, the network is equipped with a similarity discriminator that promotes features which are highly discriminative in distinguishing positive and negative pairs. A further contribution is an adaptive weighting on the most difficult samples to address the imbalance between within-identity and between-identity pairs. Our approach achieves notably improved performance over state-of-the-art methods on benchmark datasets.
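The core of the view-confusion objective described above can be illustrated with a minimal sketch: the feature encoder is trained so that a view classifier's softmax predictions are pushed toward a uniform distribution over camera views, which is the cross-entropy view-confusion idea in its simplest form. This is a hypothetical NumPy illustration of that loss term, not the authors' implementation; the function name and shapes are assumptions.

```python
import numpy as np

def view_confusion_loss(view_logits):
    """Cross-entropy between the view classifier's predictions and a
    uniform target over views. The loss is minimised (at log K for K
    views) exactly when the encoder makes the camera views
    indistinguishable -- a sketch of the view-confusion objective."""
    # Numerically stable softmax over the view dimension,
    # shape (batch, n_views).
    z = view_logits - view_logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    n_views = view_logits.shape[1]
    uniform = np.full_like(p, 1.0 / n_views)
    # Cross-entropy H(uniform, p), averaged over the batch.
    return -(uniform * np.log(p + 1e-12)).sum(axis=1).mean()
```

In an adversarial set-up, this term would be minimised by the encoder while a separate view-classification loss is minimised by the discriminator; confident view predictions yield a large confusion loss, and perfectly ambiguous ones yield the minimum of log K.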
URL
https://arxiv.org/abs/1904.01755