GaitMAST: Motion-Aware Spatio-Temporal Feature Learning Network for Cross-View Gait Recognition

Abstract

As a unique biometric that can be perceived at a distance, gait has broad applications in person authentication, social security, and related areas. Existing gait recognition methods focus on extracting either spatial or spatiotemporal representations. However, they rarely extract diverse motion features from gait sequences, even though motion is a fundamental characteristic of gait. In this paper, we propose a novel motion-aware spatiotemporal feature learning network for gait recognition, termed GaitMAST, which exploits the potential of motion-aware features. In the shallow layers, we propose a dual-path frame-level feature extractor, in which one path extracts overall spatiotemporal features and the other extracts motion-salient features by focusing on dynamic regions. In the deeper layers, we design a two-branch clip-level feature extractor, in which one branch focuses on fine-grained spatial information and the other on preserving motion detail. Consequently, GaitMAST preserves each individual's unique walking patterns well, further enhancing the robustness of the spatiotemporal features. Extensive experimental results on two commonly used cross-view gait datasets demonstrate the superior performance of GaitMAST over existing state-of-the-art methods. On CASIA-B, our model achieves an average rank-1 accuracy of 94.1%. In particular, GaitMAST achieves rank-1 accuracies of 96.1% and 88.1% under the bag-carrying and coat-wearing conditions, respectively, outperforming the second-best method by a large margin and demonstrating its robustness against spatial variations.
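
The rank-1 accuracy quoted above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `rank1_accuracy`, the Euclidean distance, and the toy 2-D embeddings are all assumptions chosen for illustration. Cross-view benchmarks typically also average this figure over probe and gallery view pairs, a step omitted here.

```python
import math

def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery feature shares their identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery entry closest to this probe (Euclidean distance,
        # an assumption; other metrics are possible).
        nearest = min(range(len(gallery_feats)),
                      key=lambda j: math.dist(feat, gallery_feats[j]))
        hits += gallery_ids[nearest] == pid
    return hits / len(probe_feats)

# Toy data: hypothetical 2-D embeddings for three subjects.
gallery_feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
gallery_ids = ["A", "B", "C"]
probe_feats = [(0.1, 0.0), (0.9, 0.1), (0.6, 0.1), (0.1, 0.8)]
probe_ids = ["A", "B", "A", "C"]

acc = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
print(acc)  # 0.75: the third probe is matched to subject B instead of A
```

In this toy case three of the four probes retrieve a gallery sample with the correct identity, so the rank-1 accuracy is 75%.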

URL

https://arxiv.org/abs/2210.11817

PDF

https://arxiv.org/pdf/2210.11817.pdf