Paper Reading AI Learner

Cross-Entropy Adversarial View Adaptation for Person Re-identification

2019-04-03 03:52:21
Lin Wu, Richang Hong, Yang Wang, Meng Wang

Abstract

Person re-identification (re-ID) is the task of matching pedestrians across disjoint camera views. To recognise paired snapshots, it has to cope with the large cross-view variations caused by the camera view shift. Supervised deep neural networks are effective at producing a set of non-linear projections that transform cross-view images into a common feature space. However, they typically impose a symmetric architecture, which can leave the network ill-conditioned during optimisation. In this paper, we learn a view-invariant subspace for person re-ID, together with its corresponding similarity metric, using an adversarial view adaptation approach. The main contribution is to learn coupled asymmetric mappings with respect to view characteristics, trained adversarially to address the view discrepancy by optimising a cross-entropy view confusion objective. To determine the similarity value, the network is equipped with a similarity discriminator that promotes features which are highly discriminant in distinguishing positive and negative pairs. A further contribution is an adaptive weighting of the most difficult samples to address the imbalance between within-identity and between-identity pairs. Our approach achieves notably improved performance compared to state-of-the-art methods on benchmark datasets.
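The two objectives named in the abstract can be sketched in a few lines. The following NumPy snippet is our own minimal reading, not the paper's implementation: the view confusion loss is the cross-entropy between a view discriminator's prediction and a uniform distribution over the two camera views (minimising it makes features view-indistinguishable), and the adaptive weighting is illustrated with a focal-style re-weighting of hard pairs. The function names and the `gamma` value are hypothetical choices.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def view_confusion_loss(view_logits):
    """Cross-entropy between the view discriminator's prediction and a
    uniform distribution over camera views. Minimised by the feature
    mappings, it pushes features from both views to be indistinguishable
    (one common reading of a 'view confusion' objective)."""
    p = softmax(view_logits)
    uniform = np.full_like(p, 1.0 / p.shape[1])
    return -np.mean(np.sum(uniform * np.log(p + 1e-12), axis=1))

def weighted_pair_loss(sim, labels, gamma=2.0):
    """Pair loss with adaptive weighting of the hardest samples:
    pairs the similarity discriminator gets most wrong receive the
    largest weight (focal-style; gamma=2.0 is an assumed value)."""
    p = 1.0 / (1.0 + np.exp(-sim))              # prob. the pair is positive
    p_correct = np.where(labels == 1, p, 1.0 - p)
    w = (1.0 - p_correct) ** gamma              # harder pair -> larger weight
    return -np.mean(w * np.log(p_correct + 1e-12))
```

Note that `view_confusion_loss` attains its minimum, log 2 for two views, exactly when the discriminator outputs a uniform prediction, which is the fixed point the adversarial training drives the feature mappings towards.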

URL

https://arxiv.org/abs/1904.01755

PDF

https://arxiv.org/pdf/1904.01755.pdf

