Paper Reading AI Learner

Attributes Guided Feature Learning for Vehicle Re-identification

2019-05-22 07:42:02
Aihua Zheng, Xianmin Lin, Chenglong Li, Ran He, Jin Tang

Abstract

Vehicle Re-ID has recently attracted enthusiastic attention due to its potential applications in smart cities and urban surveillance. However, it suffers from large intra-class variation caused by view and illumination changes, and from inter-class similarity, especially between different identities with similar appearance. To handle these issues, in this paper we propose a novel deep network architecture for vehicle Re-ID that is guided by meaningful attributes, including camera view, vehicle type, and vehicle color. In particular, our network is trained end-to-end and contains three subnetworks of deep features embedded with the corresponding attributes (i.e., camera view, vehicle type, and vehicle color). Moreover, to overcome the shortage of vehicle images from different views, we design a view-specified generative adversarial network to generate multi-view vehicle images. For network training, we annotate view labels on the VeRi-776 dataset. Note that the pre-trained view (as well as type and color) subnetworks can be directly adopted on other datasets containing only ID information, which demonstrates the generalization ability of our model. Extensive experiments on the benchmark datasets VeRi-776 and VehicleID show that the proposed approach achieves promising performance and establishes a new state-of-the-art for vehicle Re-ID.
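
The multi-branch design outlined in the abstract lends itself to a compact sketch. The PyTorch code below is a minimal, hypothetical interpretation only: the ResNet-18 backbones, 256-d embeddings, concatenation fusion, and the attribute class counts (8 views, 9 types, 10 colors) are assumptions for illustration, not the authors' published configuration.

```python
# Sketch of an attribute-guided multi-branch Re-ID network (illustrative only).
# Assumptions (not from the paper): ResNet-18 backbones, 256-d embeddings,
# fusion by concatenation, and placeholder attribute class counts.
import torch
import torch.nn as nn
from torchvision import models

class AttributeBranch(nn.Module):
    """One subnetwork whose feature is supervised by an attribute label."""
    def __init__(self, num_attr_classes, embed_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()              # keep the 512-d pooled feature
        self.backbone = backbone
        self.embed = nn.Linear(512, embed_dim)   # attribute-embedded feature
        self.attr_head = nn.Linear(embed_dim, num_attr_classes)

    def forward(self, x):
        f = self.embed(self.backbone(x))
        return f, self.attr_head(f)              # feature + attribute logits

class AttributeGuidedReID(nn.Module):
    """Three attribute-guided branches fused into one Re-ID descriptor."""
    def __init__(self, num_ids, num_views=8, num_types=9, num_colors=10):
        super().__init__()
        self.view_branch = AttributeBranch(num_views)
        self.type_branch = AttributeBranch(num_types)
        self.color_branch = AttributeBranch(num_colors)
        self.id_head = nn.Linear(256 * 3, num_ids)  # ID classifier on fused feature

    def forward(self, x):
        fv, view_logits = self.view_branch(x)
        ft, type_logits = self.type_branch(x)
        fc, color_logits = self.color_branch(x)
        feat = torch.cat([fv, ft, fc], dim=1)       # final Re-ID descriptor
        return feat, self.id_head(feat), (view_logits, type_logits, color_logits)
```

Under this reading, end-to-end training would typically sum a cross-entropy ID loss on the fused descriptor with cross-entropy losses on the three attribute heads, so each branch learns a view-, type-, or color-aware feature while the concatenated vector is used for retrieval.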

URL

https://arxiv.org/abs/1905.08997

PDF

https://arxiv.org/pdf/1905.08997.pdf

