Abstract
Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same pedestrian across different modalities, where the main challenge lies in the significant modality discrepancy. To alleviate the modality gap, recent methods generate intermediate images via GANs, grayscaling, or mixup strategies. However, these methods may introduce extra noise, and the semantic correspondence between the two modalities is not well learned. In this paper, we propose a Patch-Mixed Cross-Modality framework (PMCM), in which two images of the same person from the two modalities are split into patches and stitched into a new image for model learning. In this way, the model learns to recognize a person through patches of different styles, and the cross-modality semantic correspondence is directly embodied. With this flexible image generation strategy, the patch-mixed images can freely adjust the ratio of patches from each modality, which further alleviates the modality imbalance problem. In addition, the relationship between identity centers across modalities is explored to further reduce the modality variance, and a global-to-part constraint is introduced to regularize the representation learning of part features. On two VI-ReID datasets, we report new state-of-the-art performance with the proposed method.
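The patch-mixing idea described above can be sketched in a few lines: split a visible and an infrared image of the same person into a patch grid, then fill each grid cell from one modality or the other according to an adjustable ratio. The snippet below is a minimal illustration of that strategy only, not the authors' implementation; the function name, `patch` size, and `ir_ratio` parameter are hypothetical.

```python
import numpy as np

def patch_mix(vis_img, ir_img, patch=32, ir_ratio=0.5, rng=None):
    """Stitch patches from a visible and an infrared image of the same
    person into one mixed image (illustrative sketch of patch-mixing).

    vis_img, ir_img: H x W x C arrays of identical shape; `patch` must
    divide both H and W. `ir_ratio` sets the fraction of infrared patches.
    """
    rng = np.random.default_rng() if rng is None else rng
    assert vis_img.shape == ir_img.shape
    h, w = vis_img.shape[:2]
    assert h % patch == 0 and w % patch == 0
    gh, gw = h // patch, w // patch

    # Start from the visible image and overwrite a subset of grid cells
    # with the corresponding infrared patches.
    mixed = vis_img.copy()
    n_ir = int(round(ir_ratio * gh * gw))
    cells = rng.choice(gh * gw, size=n_ir, replace=False)
    for c in cells:
        i, j = divmod(int(c), gw)
        ys, xs = i * patch, j * patch
        mixed[ys:ys + patch, xs:xs + patch] = ir_img[ys:ys + patch, xs:xs + patch]
    return mixed
```

Because `ir_ratio` directly controls how many cells come from each modality, the same routine can generate images anywhere between fully visible and fully infrared, which is what allows the ratio of modality patches to be adjusted freely.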
URL
https://arxiv.org/abs/2302.08212