GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling

2024-04-16 01:50:10
Huantao Ren, Jiajing Chen, Senem Velipasalar

Abstract

Gait is a behavioral biometric modality that can be used to recognize individuals from a distance by the way they walk. Most existing gait recognition approaches rely on either silhouettes or skeletons, while their joint use is underexplored. Features from silhouettes and skeletons can provide complementary information for more robust recognition against appearance changes or pose estimation errors. To exploit the benefits of both silhouette and skeleton features, we propose a new gait recognition network, referred to as GaitPoint+. Our approach models skeleton key points as a 3D point cloud, and employs a computational complexity-conscious 3D point processing approach to extract skeleton features, which are then combined with silhouette features for improved accuracy. Since silhouette- or CNN-based methods already require a considerable amount of computational resources, it is preferable that the key point learning module be faster and more lightweight. We present a detailed analysis of the utilization of every human key point after the use of traditional max-pooling, and show that while elbow and ankle points are used most commonly, many useful points are discarded by max-pooling. Thus, we present a method to recycle some of the discarded points with a Recycling Max-Pooling module during the processing of skeleton point clouds, and achieve further performance improvement. We provide a comprehensive set of experimental results showing that (i) incorporating skeleton features obtained by a point-based 3D point cloud processing approach boosts the performance of three different state-of-the-art silhouette- and CNN-based baselines; and (ii) recycling the discarded points increases the accuracy further. Ablation studies are also provided to show the effectiveness and contribution of different components of our approach.
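The abstract's point about max-pooling discarding key points can be illustrated with a small sketch. This is not the authors' Recycling Max-Pooling module, whose details are not given here; it is a minimal pure-Python illustration under stated assumptions. The function names `max_pool_with_usage` and `recycle_max_pool`, the 17-key-point skeleton, and the 8-channel features are all hypothetical. The sketch records which point wins each channel under max-pooling, then runs a second max-pooling pass over the points that won nothing.

```python
import random


def max_pool_with_usage(features):
    """Channel-wise max-pooling over a set of points.

    features: list of N points, each a list of C channel values.
    Returns (pooled, winners): pooled[c] is the maximum of channel c over
    all points, and winners[c] is the index of the point that supplied it
    (the first such index when there is a tie).
    """
    num_channels = len(features[0])
    pooled, winners = [], []
    for c in range(num_channels):
        column = [point[c] for point in features]
        best = max(range(len(column)), key=column.__getitem__)
        winners.append(best)
        pooled.append(column[best])
    return pooled, winners


def recycle_max_pool(features):
    """Hypothetical recycling step (an assumption, not the paper's module).

    Points that win no channel in the first pass are pooled again on their
    own, so they still contribute a second feature vector.
    Returns (pooled, recycled, used, discarded); recycled is None when
    every point was already used.
    """
    pooled, winners = max_pool_with_usage(features)
    used = set(winners)
    discarded = [i for i in range(len(features)) if i not in used]
    if not discarded:
        return pooled, None, used, discarded
    leftover = [features[i] for i in discarded]
    recycled, _ = max_pool_with_usage(leftover)
    return pooled, recycled, used, discarded


if __name__ == "__main__":
    random.seed(0)
    # Assumed sizes: 17 key points with 8 feature channels each.
    feats = [[random.random() for _ in range(8)] for _ in range(17)]
    pooled, recycled, used, discarded = recycle_max_pool(feats)
    print("points used by max-pooling:", sorted(used))
    print("points discarded by max-pooling:", discarded)
```

With C channels, at most C distinct points can win the first pass, so whenever the number of points exceeds the number of channels some points necessarily contribute nothing to the pooled vector. How the recycled vector is then used (for example, in an additional loss or a fused feature) is left open here, since the abstract does not specify it.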

URL

https://arxiv.org/abs/2404.10213

PDF

https://arxiv.org/pdf/2404.10213.pdf
