GaitRef: Gait Recognition with Refined Sequential Skeletons

2023-04-16 23:37:24
Haidong Zhu, Wanrong Zheng, Zhaoheng Zheng, Ram Nevatia

Abstract

Identifying humans by their walking sequences, known as gait recognition, is a useful biometric understanding task because gait can be observed from a long distance and does not require cooperation from the subject. Two common modalities for representing a person's walking sequence are silhouettes and joint skeletons. Silhouette sequences, which record the boundary of the walking person in each frame, may suffer from appearance variations caused by carried objects and clothing. Framewise joint detections are noisy and introduce jitter that is inconsistent across sequential detections. In this paper, we combine the silhouettes and skeletons and refine the framewise joint predictions for gait recognition. With temporal information from the silhouette sequences, we show that the refined skeletons can improve gait recognition performance without extra annotations. We compare our methods on four public datasets, CASIA-B, OUMVLP, Gait3D and GREW, and show state-of-the-art performance.

URL

https://arxiv.org/abs/2304.07916

PDF

https://arxiv.org/pdf/2304.07916.pdf

