
GPGait: Generalized Pose-based Gait Recognition

2023-03-09 13:17:13
Yang Fu, Shibei Meng, Saihui Hou, Xuecai Hu, Yongzhen Huang

Abstract

Recent works on pose-based gait recognition have demonstrated the potential of using such simple information to achieve results comparable to silhouette-based methods. However, the generalization ability of pose-based methods across datasets is noticeably inferior to that of silhouette-based ones, an issue that has received little attention but hinders the application of these methods in real-world scenarios. To improve the cross-dataset generalization of pose-based methods, we propose a Generalized Pose-based Gait recognition (GPGait) framework. First, a Human-Oriented Transformation (HOT) and a series of Human-Oriented Descriptors (HOD) are proposed to obtain a unified pose representation with discriminative multi-features. Then, since the unified representation after HOT and HOD exhibits only slight variations across samples, it becomes crucial for the network to extract local-global relationships among the keypoints. To this end, a Part-Aware Graph Convolutional Network (PAGCN) is proposed to enable efficient graph partition and local-global spatial feature extraction. Experiments on four public gait recognition datasets, CASIA-B, OUMVLP-Pose, Gait3D and GREW, show that our model demonstrates better and more stable cross-domain capabilities than existing skeleton-based methods, achieving recognition results comparable to silhouette-based ones. The code will be released.
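The abstract only names the components, so the sketch below is an interpretation rather than the paper's implementation: a HOT-style normalization is read here as centering each pose on the hip midpoint and rescaling by torso length, and the PAGCN partition is read as per-body-part adjacency masks plus one global mask. The COCO 17-keypoint layout, the `PARTS` grouping, and the function names are assumptions introduced for illustration.

```python
import numpy as np

# COCO-style 17-keypoint indexing is assumed; the abstract does not specify it.
NOSE, L_SHOULDER, R_SHOULDER, L_HIP, R_HIP = 0, 5, 6, 11, 12

# Hypothetical body-part grouping for the part-aware masks (not from the paper).
PARTS = {
    "head":      [0, 1, 2, 3, 4],
    "left_arm":  [5, 7, 9],
    "right_arm": [6, 8, 10],
    "left_leg":  [11, 13, 15],
    "right_leg": [12, 14, 16],
}

def hot_normalize(keypoints):
    """Sketch of a Human-Oriented-Transformation-style normalization.

    keypoints: (T, 17, 2) array of (x, y) image coordinates for one sequence.
    Centers each frame on the hip midpoint and rescales by torso length,
    removing camera-dependent position and scale. The actual HOT may differ.
    """
    kp = np.asarray(keypoints, dtype=np.float64)
    hip_center = kp[:, [L_HIP, R_HIP]].mean(axis=1, keepdims=True)           # (T, 1, 2)
    shoulder_center = kp[:, [L_SHOULDER, R_SHOULDER]].mean(axis=1, keepdims=True)
    torso = np.linalg.norm(shoulder_center - hip_center, axis=-1, keepdims=True)
    torso = np.maximum(torso, 1e-6)                                          # avoid divide-by-zero
    return (kp - hip_center) / torso

def part_masks(num_joints=17):
    """Binary adjacency masks for a part-aware graph partition.

    Returns one mask per body part (edges only within the part) plus a global
    all-ones mask, mirroring the local-global idea described for PAGCN.
    """
    masks = []
    for joints in PARTS.values():
        m = np.zeros((num_joints, num_joints))
        m[np.ix_(joints, joints)] = 1.0
        masks.append(m)
    masks.append(np.ones((num_joints, num_joints)))  # global connectivity
    return np.stack(masks)

if __name__ == "__main__":
    seq = np.random.rand(30, 17, 2) * 200            # fake 30-frame pose sequence
    unified = hot_normalize(seq)
    print(unified.shape, part_masks().shape)         # (30, 17, 2) (6, 17, 17)
```

In a GCN layer, each mask would gate a separate learnable adjacency so that some branches aggregate only within a body part while the global branch captures cross-part relations; how GPGait combines these branches is not described in the abstract.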


URL

https://arxiv.org/abs/2303.05234

PDF

https://arxiv.org/pdf/2303.05234.pdf

