Watch Your Pose: Unsupervised Domain Adaption with Pose based Triplet Selection for Gait Recognition

2023-07-13 13:41:32
Gavriel Habib, Noa Barzilay, Or Shimshi, Rami Ben-Ari, Nir Darshan

Abstract

Gait recognition is a computer vision task that aims to identify people by their walking patterns. Existing methods show impressive results on individual datasets but generalize poorly to unseen scenarios. Unsupervised Domain Adaptation (UDA) adapts a model, pre-trained in a supervised manner on a source domain, to an unlabelled target domain. UDA for gait recognition is still in its infancy, and existing works address only limited scenarios. In this paper, we reveal a fundamental phenomenon in the adaptation of gait recognition models: the target domain is biased toward pose-based features rather than identity features, causing a significant performance drop in the identification task. To reduce this bias, we propose a Gait Orientation-based method for Unsupervised Domain Adaptation (GOUDA). To this end, we present a novel Triplet Selection algorithm with a curriculum learning framework, which adapts the embedding space by pushing away samples of similar poses and bringing closer samples of different poses. We provide extensive experiments on four widely used gait datasets (CASIA-B, OU-MVLP, GREW, and Gait3D) and on three backbones (GaitSet, GaitPart, and GaitGL), showing the superiority of our proposed method over prior works.
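
The pose-based triplet idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`select_triplets`, `triplet_loss`, `angle_gap`), the 30-degree threshold, the nearest-neighbour selection rule, and the margin value are all assumptions made for illustration, and the curriculum learning component is omitted. The sketch assumes each unlabelled target sample comes with an embedding and an estimated walking orientation in degrees. For each anchor, it picks the closest sample with a different orientation as the positive (to be pulled closer) and the closest sample with a similar orientation as the negative (to be pushed away).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def angle_gap(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def select_triplets(embeddings, angles, gap_threshold=30.0):
    """Return (anchor, positive, negative) index triplets.

    Assumed rule (hypothetical, for illustration only):
      positive = nearest sample whose orientation differs from the anchor's
                 by at least gap_threshold degrees;
      negative = nearest sample whose orientation is within gap_threshold.
    Anchors lacking either kind of candidate are skipped.
    """
    triplets = []
    for i, (emb_i, ang_i) in enumerate(zip(embeddings, angles)):
        similar, different = [], []
        for j, (emb_j, ang_j) in enumerate(zip(embeddings, angles)):
            if j == i:
                continue
            dist = euclidean(emb_i, emb_j)
            if angle_gap(ang_i, ang_j) < gap_threshold:
                similar.append((dist, j))
            else:
                different.append((dist, j))
        if similar and different:
            positive = min(different)[1]
            negative = min(similar)[1]
            triplets.append((i, positive, negative))
    return triplets


def triplet_loss(embeddings, triplet, margin=0.2):
    """Standard margin-based triplet loss for one triplet of indices."""
    a, p, n = triplet
    d_ap = euclidean(embeddings[a], embeddings[p])
    d_an = euclidean(embeddings[a], embeddings[n])
    return max(0.0, d_ap - d_an + margin)


if __name__ == "__main__":
    # Toy data: four 2-D embeddings with estimated orientations in degrees.
    embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
    angles = [0.0, 10.0, 90.0, 180.0]
    for t in select_triplets(embeddings, angles):
        print(t, triplet_loss(embeddings, t))
```

In an actual training loop the loss would be computed on differentiable tensors and the selection would be refreshed as the embedding space changes; the pure-Python version above only shows the selection logic.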

URL

https://arxiv.org/abs/2307.06751

PDF

https://arxiv.org/pdf/2307.06751.pdf

