Exploring More from Multiple Gait Modalities for Human Identification

2024-12-16 07:15:13
Dongyang Jin, Chao Fan, Weihua Chen, Shiqi Yu

Abstract

Gait, a soft biometric characteristic, reflects the distinct walking patterns of individuals at a distance, making it a promising technique for unconstrained human identification. Because they largely exclude the gait-unrelated cues hidden in RGB videos, the silhouette and skeleton, though visually compact, have long been two of the most prevalent gait modalities. Recently, several attempts have introduced more informative data forms, such as human parsing and optical flow images, to capture gait characteristics, along with multi-branch architectures. However, due to inconsistencies in model designs and experimental settings, we argue that a comprehensive and fair comparative study of these popular gait modalities, covering both representational capacity and fusion strategies, is still lacking. From the perspectives of fine- vs. coarse-grained shape and whole vs. pixel-wise motion modeling, this work presents an in-depth investigation of three popular gait representations, i.e., silhouette, human parsing, and optical flow, with various fusion evaluations, and experimentally exposes their similarities and differences. Based on the obtained insights, we further develop a C$^2$Fusion strategy and use it to build our new framework, MultiGait++. C$^2$Fusion preserves commonalities while highlighting differences to enrich the learning of gait features. To verify our findings and conclusions, we conduct extensive experiments on Gait3D, GREW, CCPG, and SUSTech1K. The code is available at this https URL.

URL

https://arxiv.org/abs/2412.11495

PDF

https://arxiv.org/pdf/2412.11495.pdf

