Leveraging Motion Priors in Videos for Improving Human Segmentation

2018-07-30 16:52:04
Yu-Ting Chen, Wen-Yen Chang, Hai-Lun Lu, Tingfan Wu, Min Sun

Abstract

Despite many advances in deep-learning-based semantic segmentation, performance often drops in the real world because of distribution mismatch. Recently, a few domain adaptation and active learning approaches have been proposed to mitigate this drop. However, very little attention has been paid to leveraging the information in videos, which most camera systems capture naturally. In this work, we propose to leverage the "motion prior" in videos to improve human segmentation in a weakly-supervised active learning setting. By extracting motion information from videos with optical flow, we obtain candidate foreground motion segments (referred to as the motion prior) that potentially correspond to human segments. We propose to learn a memory-network-based policy model that selects strong candidate segments (referred to as the strong motion prior) through reinforcement learning. The selected segments have high precision and are used directly to finetune the model. On a newly collected surveillance camera dataset and the publicly available UrbanStreet dataset, our method improves human segmentation performance across multiple scenes and modalities (i.e., RGB to infrared (IR)). Finally, our method is empirically complementary to existing domain adaptation approaches: combining our weakly-supervised active learning approach with domain adaptation yields an additional performance gain.
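
To make the idea of a candidate "motion prior" concrete, here is a minimal sketch that is not the authors' pipeline. It assumes a dense optical flow field has already been computed by some external method and is given as nested lists of (dx, dy) displacements. It marks pixels whose flow magnitude exceeds a threshold as candidate foreground, then measures the precision of that mask against a reference mask. The function names, the input format, and the fixed threshold are all hypothetical. The learned memory-network policy that selects strong candidates is not modeled here.

```python
import math


def motion_prior_mask(flow, threshold=1.0):
    """Mark pixels whose flow magnitude exceeds `threshold` as candidate foreground.

    `flow` is a list of rows, each a list of (dx, dy) tuples. This input format
    and the fixed threshold are assumptions made for illustration only.
    """
    return [
        [1 if math.hypot(dx, dy) > threshold else 0 for dx, dy in row]
        for row in flow
    ]


def mask_precision(mask, reference):
    """Fraction of predicted foreground pixels that are foreground in `reference`."""
    predicted = sum(sum(row) for row in mask)
    if predicted == 0:
        return 0.0  # no foreground predicted, so precision is reported as 0
    hits = sum(
        m & r
        for mask_row, ref_row in zip(mask, reference)
        for m, r in zip(mask_row, ref_row)
    )
    return hits / predicted


# Toy 3x3 flow field: three pixels move noticeably, the rest are nearly static.
flow = [
    [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2)],
    [(0.0, 0.0), (3.0, 4.0), (2.0, 0.0)],
    [(0.0, 0.1), (1.5, 1.5), (0.0, 0.0)],
]
reference = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

mask = motion_prior_mask(flow, threshold=1.0)
precision = mask_precision(mask, reference)
```

In this toy case one moving pixel falls outside the reference human region, so the raw mask is not fully precise. That gap is the motivation the abstract gives for selecting only strong candidate segments before finetuning.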

URL

https://arxiv.org/abs/1807.11436

PDF

https://arxiv.org/pdf/1807.11436.pdf

