Paper Reading AI Learner

Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification

2020-06-01 12:26:08
Hanrong Ye, Hong Liu, Fanyang Meng, Xia Li

Abstract

RGB-Infrared person re-identification (RGB-IR Re-ID) is a cross-modality matching problem with promising applications in the dark environment. Most existing works use Euclidean metric based constraints to resolve the discrepancy between features of different modalities. However, these methods are incapable of learning angularly discriminative feature embedding because Euclidean distance cannot measure the included angle between embedding vectors effectively. As an angularly discriminative feature space is important for classifying the human images based on their embedding vectors, in this paper, we propose a novel ranking loss function, named Bi-directional Exponential Angular Triplet Loss, to help learn an angularly separable common feature space by explicitly constraining the included angles between embedding vectors. Moreover, to help stabilize and learn the magnitudes of embedding vectors, we adopt a common space batch normalization layer. Quantitative experiments on the SYSU-MM01 and RegDB dataset support our analysis. On SYSU-MM01 dataset, the performance is improved from 7.40%/11.46% to 38.57%/38.61% for rank-1 accuracy/mAP compared with the baseline. The proposed method can be generalized to the task of single-modality Re-ID and improves the rank-1 accuracy/mAP from 92.0%/81.7% to 94.7%/86.6% on the Market-1501 dataset, from 82.6%/70.6% to 87.6%/77.1% on the DukeMTMC-reID dataset.

Abstract (translated)

URL

https://arxiv.org/abs/2006.00878

PDF

https://arxiv.org/pdf/2006.00878.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot