
Spatial Cascaded Clustering and Weighted Memory for Unsupervised Person Re-identification

2024-03-01 03:52:29
Jiahao Hong, Jialong Zuo, Chuchu Han, Ruochen Zheng, Ming Tian, Changxin Gao, Nong Sang

Abstract

Recent unsupervised person re-identification (re-ID) methods achieve high performance by leveraging fine-grained local context. These methods are referred to as part-based methods. However, most part-based methods obtain local contexts through horizontal division, which suffers from misalignment caused by varying human poses. Additionally, the misalignment of semantic information in part features restricts the use of metric learning, which reduces the effectiveness of part-based methods. Together, these two issues leave part features under-utilized in part-based methods. We introduce the Spatial Cascaded Clustering and Weighted Memory (SCWM) method to address these challenges. SCWM aims to parse and align more accurate local contexts for different human body parts while allowing the memory module to balance hard example mining and noise suppression. Specifically, we first analyze the foreground omission and spatial confusion issues in the previous method. Then, we propose foreground and space corrections to enhance the completeness and reasonableness of the human parsing results. Next, we introduce a weighted memory and utilize two weighting strategies. These strategies address hard sample mining for global features and enhance noise resistance for part features, which enables better utilization of both global and part features. Extensive experiments on Market-1501 and MSMT17 validate the proposed method's effectiveness over many state-of-the-art methods.
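
To make the "horizontal division" baseline mentioned in the abstract concrete, here is a minimal sketch of how part features are commonly obtained by slicing a feature map into fixed horizontal stripes and average-pooling each one. This illustrates the baseline that the abstract criticizes, not SCWM itself. The function name, the H x W x C nested-list layout, and the `num_parts` parameter are assumptions made for illustration and do not come from the paper.

```python
from typing import List

# Hypothetical layout: height x width x channels, as nested Python lists.
FeatureMap = List[List[List[float]]]


def horizontal_part_features(fmap: FeatureMap, num_parts: int) -> List[List[float]]:
    """Split a feature map into horizontal stripes and average-pool each stripe.

    Returns one channel vector per stripe, ordered top to bottom.
    """
    height = len(fmap)
    if num_parts <= 0 or height < num_parts:
        raise ValueError("need 1 <= num_parts <= height")
    channels = len(fmap[0][0])

    parts = []
    for p in range(num_parts):
        # Stripe boundaries depend only on the row index, never on image content.
        top = p * height // num_parts
        bottom = (p + 1) * height // num_parts
        sums = [0.0] * channels
        count = 0
        for row in fmap[top:bottom]:
            for cell in row:
                for c, value in enumerate(cell):
                    sums[c] += value
                count += 1
        parts.append([s / count for s in sums])
    return parts


# Toy 4 x 2 x 1 feature map whose value equals the (1-based) row index.
toy = [[[float(r)], [float(r)]] for r in (1, 2, 3, 4)]
upper, lower = horizontal_part_features(toy, num_parts=2)
```

Because the stripe boundaries are fixed by row index, a person who is bending, crouching, or off-centre in the crop places different body parts in the same stripe across images. This is the pose-induced misalignment the abstract describes.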


URL

https://arxiv.org/abs/2403.00261

PDF

https://arxiv.org/pdf/2403.00261.pdf

