Paper Reading AI Learner

Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment

2024-04-10 02:03:14
Yexin Liu, Weiming Zhang, Athanasios V. Vasilakos, Lin Wang

Abstract

Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling. Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID. However, there exist two challenges: 1) noisy pseudo labels might be generated in the clustering process, and 2) cross-modality feature alignment via matching the marginal distributions of the visible and infrared modalities may misalign different identities across the two modalities. In this paper, we first conduct a theoretical analysis in which an interpretable generalization upper bound is introduced. Based on this analysis, we then propose a novel unsupervised cross-modality person re-identification framework (PRAISE). Specifically, to address the first challenge, we propose a pseudo-label correction strategy that utilizes a Beta Mixture Model to predict the probability of mis-clustering based on the network's memory effect, and rectifies the correspondences by adding a perceptual term to contrastive learning. Next, we introduce a modality-level alignment strategy that generates paired visible-infrared latent features and reduces the modality gap by aligning the labeling functions of visible and infrared features, so as to learn identity-discriminative and modality-invariant features. Experimental results on two benchmark datasets demonstrate that our method achieves state-of-the-art performance compared with existing unsupervised visible-infrared ReID methods.
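The pseudo-label correction step relies on the memory effect: networks fit correctly clustered samples first, so mis-clustered samples tend to have larger per-sample losses. A two-component Beta Mixture Model fitted to the (min-max normalised) loss distribution then yields a per-sample probability of mis-clustering. The paper does not release this code here; below is a minimal, self-contained sketch of such a Beta-mixture fit via EM with a moment-matching M-step (the function names, initialisation, and iteration count are our assumptions, not the authors' implementation).

```python
import math
import numpy as np

def beta_logpdf(x, a, b):
    """Elementwise log-density of Beta(a, b) for x in (0, 1)."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * np.log(x) + (b - 1) * np.log(1 - x) - log_norm

def fit_beta_mixture(losses, n_iter=30, eps=1e-4):
    """Fit a 2-component Beta mixture to normalised per-sample losses
    with EM (illustrative sketch, not the authors' code).
    Returns the posterior probability that each sample belongs to the
    high-loss component, interpreted as P(mis-clustered | loss)."""
    x = np.clip(losses, eps, 1 - eps)
    # Initialise responsibilities by splitting at the median loss.
    gamma = np.stack([x <= np.median(x), x > np.median(x)], axis=1).astype(float)
    params = [(2.0, 5.0), (5.0, 2.0)]  # initial (alpha, beta) per component
    for _ in range(n_iter):
        # M-step: weighted method-of-moments estimate per component.
        new_params = []
        for k in range(2):
            w = gamma[:, k] / max(gamma[:, k].sum(), eps)
            m = float(np.sum(w * x))
            v = float(np.sum(w * (x - m) ** 2)) + eps
            common = max(m * (1 - m) / v - 1, eps)
            new_params.append((max(m * common, eps), max((1 - m) * common, eps)))
        params = new_params
        pi = gamma.mean(axis=0)
        # E-step: posterior responsibilities under current densities.
        log_p = np.stack(
            [np.log(pi[k] + eps) + beta_logpdf(x, *params[k]) for k in range(2)],
            axis=1,
        )
        log_p -= log_p.max(axis=1, keepdims=True)
        p = np.exp(log_p)
        gamma = p / p.sum(axis=1, keepdims=True)
    # The component with the larger mean loss models mis-clustered samples.
    noisy = int(np.argmax([a / (a + b) for a, b in params]))
    return gamma[:, noisy]
```

In the framework described by the abstract, these posterior probabilities would then weight (or gate) the contrastive objective so that likely mis-clustered samples contribute less, with the added perceptual term rectifying their cluster correspondences.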


URL

https://arxiv.org/abs/2404.06683

PDF

https://arxiv.org/pdf/2404.06683.pdf

