
Unsupervised Domain Adaptation for Multispectral Pedestrian Detection

2019-04-07 17:24:28
Dayan Guan, Xing Luo, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, George Vosselman, Michael Ying Yang

Abstract

Multimodal information (e.g., visible and thermal) can yield robust pedestrian detections that facilitate around-the-clock computer vision applications, such as autonomous driving and video surveillance. However, it remains a crucial challenge to train a reliable detector that works well across different multispectral pedestrian datasets without manual annotations. In this paper, we propose a novel unsupervised domain adaptation framework for multispectral pedestrian detection that iteratively generates pseudo annotations and updates the parameters of our designed multispectral pedestrian detector on the target domain. Pseudo annotations are first generated using the detector trained on the source domain, and then updated by fixing the detector parameters and minimizing the cross-entropy loss without back-propagation. Training labels are generated from the pseudo annotations by considering the similarity and complementarity between well-aligned visible and infrared image pairs. The detector parameters are updated on the generated labels by minimizing our defined multi-detection loss function with back-propagation. The optimal detector parameters are obtained after iteratively updating the pseudo annotations and parameters. Experimental results show that our proposed unsupervised multimodal domain adaptation method achieves significantly higher detection performance than the approach without domain adaptation, and is competitive with supervised multispectral pedestrian detectors.
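
The abstract describes an alternating scheme: pseudo annotations are produced with the detector frozen, fused across the visible and thermal branches, and then used to re-train the detector on the target domain. Below is a minimal Python/PyTorch sketch of that loop under stated assumptions; `detector`, its `compute_loss` method, `fuse_modalities`, the thresholds, and the data-loader layout are hypothetical placeholders, not the authors' implementation (the paper additionally refines pseudo annotations with a cross-entropy loss and trains with a multi-detection loss, which this sketch collapses into a single detection loss).

```python
# Hypothetical sketch of the iterative pseudo-annotation / re-training loop
# outlined in the abstract. All names and thresholds are illustrative.
import torch
from torchvision.ops import nms

def fuse_modalities(dets_v, dets_t, score_thresh=0.5, iou_thresh=0.5):
    """Merge visible and thermal detections: keep confident boxes from either
    branch (complementarity) and suppress duplicate boxes of the same person
    detected in both modalities (similarity)."""
    boxes = torch.cat([dets_v["boxes"], dets_t["boxes"]])
    scores = torch.cat([dets_v["scores"], dets_t["scores"]])
    keep = scores > score_thresh
    boxes, scores = boxes[keep], scores[keep]
    keep = nms(boxes, scores, iou_thresh)
    return {"boxes": boxes[keep], "scores": scores[keep]}

def adapt_to_target_domain(detector, target_loader, optimizer, rounds=3, epochs=2):
    """Alternate between (a) generating pseudo annotations with the detector
    parameters fixed and (b) updating the parameters on those annotations."""
    for _ in range(rounds):
        # (a) Parameters fixed, no back-propagation: predict on paired
        # visible/thermal target images and fuse the two branches.
        detector.eval()
        pseudo = []
        with torch.no_grad():
            for visible, thermal in target_loader:   # assumes a non-shuffling loader
                pseudo.append(fuse_modalities(detector(visible), detector(thermal)))

        # (b) Back-propagation: minimize a detection loss on the pseudo labels.
        detector.train()
        for _ in range(epochs):
            for (visible, thermal), labels in zip(target_loader, pseudo):
                loss = detector.compute_loss(visible, thermal, labels)  # hypothetical API
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    return detector
```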

URL

https://arxiv.org/abs/1904.03692

PDF

https://arxiv.org/pdf/1904.03692.pdf

