Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams

2024-03-05 11:57:10
Duy Tran Thanh, Yeejin Lee, Byeongkeun Kang

Abstract

This work addresses the task of long-term person re-identification. Typically, person re-identification assumes that people do not change their clothes, which limits its applications to short-term scenarios. To overcome this limitation, we investigate long-term person re-identification, which considers both clothes-changing and clothes-consistent scenarios. In this paper, we propose a novel framework that effectively learns and utilizes both global and local information. The proposed framework consists of three streams: global, local body part, and head streams. The global and head streams encode identity-relevant information from an entire image and a cropped image of the head region, respectively. Both streams encode the most distinct, less distinct, and average features using the combinations of adversarial erasing, max pooling, and average pooling. The local body part stream extracts identity-related information for each body part, allowing it to be compared with the same body part from another image. Since body part annotations are not available in re-identification datasets, pseudo-labels are generated using clustering. These labels are then utilized to train a body part segmentation head in the local body part stream. The proposed framework is trained by backpropagating the weighted summation of the identity classification loss, the pair-based loss, and the pseudo body part segmentation loss. To demonstrate the effectiveness of the proposed method, we conducted experiments on three publicly available datasets (Celeb-reID, PRCC, and VC-Clothes). The experimental results demonstrate that the proposed method outperforms the previous state-of-the-art method.
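The training objective described in the abstract, a weighted summation of the identity classification loss, a pair-based loss, and the pseudo body part segmentation loss, can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact loss formulations or weights, so everything concrete here is an assumption: softmax cross-entropy stands in for the identity and segmentation losses, a triplet-style hinge stands in for the pair-based loss, and the function names and the weights `w_id`, `w_pair`, and `w_seg` are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample; `target` is the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Assumed stand-in for the pair-based loss: hinge on Euclidean distances."""
    d_ap = math.dist(anchor, positive)  # same identity should be close
    d_an = math.dist(anchor, negative)  # different identity should be far
    return max(0.0, d_ap - d_an + margin)


def segmentation_loss(pixel_logits, pseudo_labels):
    """Mean per-pixel cross-entropy against clustering-derived pseudo-labels."""
    losses = [cross_entropy(l, y) for l, y in zip(pixel_logits, pseudo_labels)]
    return sum(losses) / len(losses)


def total_loss(l_id, l_pair, l_seg, w_id=1.0, w_pair=1.0, w_seg=0.1):
    """Weighted summation of the three terms; the default weights are illustrative."""
    return w_id * l_id + w_pair * l_pair + w_seg * l_seg


if __name__ == "__main__":
    # Toy values only; none of these numbers come from the paper.
    l_id = cross_entropy([2.0, 0.5, -1.0], target=0)
    l_pair = triplet_loss([0.1, 0.2], [0.1, 0.3], [0.9, 0.8])
    l_seg = segmentation_loss([[1.0, 0.0], [0.2, 0.7]], [0, 1])
    print(total_loss(l_id, l_pair, l_seg))
```

In an actual training loop the three terms would be computed from the network outputs of the global, local body part, and head streams, and the single scalar returned by `total_loss` would be the quantity that is backpropagated.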

URL

https://arxiv.org/abs/2403.02892

PDF

https://arxiv.org/pdf/2403.02892.pdf

