Paper Reading AI Learner

HPILN: A feature learning framework for cross-modality person re-identification

2019-06-07 14:55:47
Jian-Wu Lin, Hao Li

Abstract

Most video surveillance systems use both RGB and infrared cameras, making it a vital technique to re-identify a person cross the RGB and infrared modalities. This task can be challenging due to both the cross-modality variations caused by heterogeneous images in RGB and infrared, and the intra-modality variations caused by the heterogeneous human poses, camera views, light brightness, etc. To meet these challenges a novel feature learning framework, HPILN, is proposed. In the framework existing single-modality re-identification models are modified to fit for the cross-modality scenario, following which specifically designed hard pentaplet loss and identity loss are used to improve the performance of the modified cross-modality re-identification models. Based on the benchmark of the SYSU-MM01 dataset, extensive experiments have been conducted, which show that the proposed method outperforms all existing methods in terms of Cumulative Match Characteristic curve (CMC) and Mean Average Precision (MAP).

Abstract (translated)

大多数视频监控系统同时使用RGB和红外摄像机,这使得重新识别穿过RGB和红外模式的人成为一项至关重要的技术。由于RGB和红外图像的异质性引起的交叉模态变化,以及由于人体姿态、相机视图、光线亮度等异质性引起的模态变化,这项任务可能具有挑战性。为了应对这些挑战,提出了一种新的特征学习框架hpiln。在该框架中,对现有的单模态重新识别模型进行了修改,以适应交叉模态场景,然后采用专门设计的硬五角形损失和身份损失来提高改进的交叉模态重新识别模型的性能。以SYSU-MM01数据集为基准,进行了大量的实验研究,结果表明,该方法在累积匹配特征曲线(CMC)和平均精度(MAP)方面优于现有方法。

URL

https://arxiv.org/abs/1906.03142

PDF

https://arxiv.org/pdf/1906.03142.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot