
TriGait: Aligning and Fusing Skeleton and Silhouette Gait Data via a Tri-Branch Network

2023-08-25 12:19:51
Yan Sun, Xueling Feng, Liyan Ma, Long Hu, Mark Nixon

Abstract

Gait recognition is a promising biometric technology for identification because it is non-invasive and works at long distance. However, external variations such as clothing changes and viewpoint differences pose significant challenges to gait recognition. Silhouette-based methods preserve body shape but neglect internal structure information, while skeleton-based methods preserve structure information but omit appearance. To fully exploit the complementary nature of the two modalities, this paper proposes a novel tri-branch gait recognition framework, TriGait. It integrates features from skeleton and silhouette data in a hybrid fusion manner through three components: a two-stream network that extracts static and motion features from appearance, a simple yet effective module named JSA-TC that captures dependencies between all joints, and a third branch for cross-modal learning that aligns and fuses low-level features of the two modalities. Experimental results demonstrate the effectiveness of TriGait for gait recognition. The proposed method achieves a mean rank-1 accuracy of 96.0% over all conditions on the CASIA-B dataset and 94.3% accuracy under the clothing-change (CL) condition, significantly outperforming all the state-of-the-art methods. The source code will be available at this https URL.
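The headline numbers above are rank-1 accuracies. As a reading aid, here is a minimal sketch of how rank-1 accuracy is commonly computed: each probe embedding is matched to its nearest gallery embedding, and the score is the fraction of probes whose nearest match has the same subject ID. The function name `rank1_accuracy`, the Euclidean distance, and the toy data are assumptions for illustration. The abstract does not describe the paper's exact evaluation protocol (for example, how views are paired on CASIA-B), so this is not the authors' code.

```python
import math


def rank1_accuracy(probe, gallery):
    """Fraction of probes whose nearest gallery sample shares their subject ID.

    probe, gallery: lists of (embedding, subject_id) pairs, where each
    embedding is a sequence of floats of equal length.
    Hypothetical helper; Euclidean distance is an assumption.
    """
    if not probe or not gallery:
        raise ValueError("probe and gallery must be non-empty")
    hits = 0
    for p_vec, p_id in probe:
        # Pick the gallery entry closest to this probe embedding.
        _, nearest_id = min(gallery, key=lambda g: math.dist(p_vec, g[0]))
        hits += nearest_id == p_id
    return hits / len(probe)


# Toy data (made up): two gallery subjects, three probes.
gallery = [([0.0, 0.0], "A"), ([1.0, 1.0], "B")]
probe = [([0.1, 0.0], "A"), ([0.9, 1.2], "B"), ([0.2, 0.1], "B")]
print(rank1_accuracy(probe, gallery))
```

In this toy case the third probe lies closer to subject A than to its true subject B, so two of the three probes are matched correctly.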

Abstract (translated)

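Gait recognition is a promising biometric technology because it is non-invasive and works at long distance. However, external variations such as clothing changes and viewpoint differences pose major challenges to gait recognition. Silhouette-based methods preserve body shape but ignore internal structure information, whereas skeleton-based methods preserve structure information but ignore appearance. To fully exploit the complementarity of the two modalities, this paper proposes a new tri-branch gait recognition framework named TriGait. It integrates features from skeleton and silhouette data through hybrid fusion, and comprises a two-stream network that extracts static and motion features from appearance, a simple yet effective module named JSA-TC that captures dependencies between all joints, and a third branch that performs cross-modal learning by aligning and fusing low-level features of the two modalities. Experimental results demonstrate the superiority and effectiveness of TriGait for gait recognition. The method achieves a mean rank-1 accuracy of 96.0% over all conditions on the CASIA-B dataset and 94.3% accuracy under the CL condition, significantly outperforming all state-of-the-art methods. The source code will be available at this https URL.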

URL

https://arxiv.org/abs/2308.13340

PDF

https://arxiv.org/pdf/2308.13340.pdf

