Abstract
Deep learning research has made many biometric recognition solutions viable, but it requires vast amounts of training data to generalize to real-world conditions. Unlike other biometric traits, such as face and ear, gait samples cannot be easily crawled from the web to form massive unconstrained datasets. Because the human body has been extensively studied for different digital applications, one can rely on prior shape knowledge to overcome data scarcity. This work follows the recent trend of fitting a 3D deformable body model to gait videos using deep neural networks to obtain disentangled shape and pose representations for each frame. To enforce temporal consistency in the network, we introduce a new Linear Dynamical Systems (LDS) module and loss based on Koopman operator theory, which provides an unsupervised motion regularization that exploits the periodic nature of gait, as well as a predictive capacity for extending gait sequences. We compare LDS to the traditional adversarial training approach and use the USF HumanID and CASIA-B datasets to show that LDS can obtain better accuracy with less training data. Finally, we show that our 3D modeling approach is much better than other 3D gait approaches at overcoming viewpoint variation under normal, bag-carrying, and clothing-change conditions.
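To give a rough intuition for the linear-dynamics idea, the sketch below fits a single linear operator to a toy periodic latent sequence by least squares, measures the one-step prediction error, and rolls the operator forward to extend the sequence. This is an illustrative assumption, not the paper's LDS module: the function names (`fit_linear_dynamics`, `lds_loss`, `extend_sequence`), the closed-form least-squares fit, and the 2-D toy latent are all hypothetical stand-ins for a module that the paper trains inside a neural network.

```python
import numpy as np


def fit_linear_dynamics(z):
    """Least-squares fit of K so that z[t + 1] is approximately z[t] @ K.

    z has shape (T, d): one d-dimensional latent vector per frame.
    """
    x, y = z[:-1], z[1:]
    k, *_ = np.linalg.lstsq(x, y, rcond=None)
    return k


def lds_loss(z, k):
    """Mean squared one-step prediction error under the linear operator K."""
    pred = z[:-1] @ k
    return float(np.mean((pred - z[1:]) ** 2))


def extend_sequence(z_last, k, steps):
    """Roll the linear operator forward to predict `steps` future latents."""
    out = []
    cur = z_last
    for _ in range(steps):
        cur = cur @ k
        out.append(cur)
    return np.stack(out)


# Toy periodic "gait" latent (assumption): a point moving on a circle
# with a period of 20 frames, observed for 40 frames.
t = np.arange(40)
omega = 2 * np.pi / 20
z = np.stack([np.cos(omega * t), np.sin(omega * t)], axis=1)

k = fit_linear_dynamics(z)
loss = lds_loss(z, k)
future = extend_sequence(z[-1], k, steps=20)
```

Because the toy latent is exactly periodic, the fitted operator is a rotation whose eigenvalues lie on the unit circle, the one-step loss is numerically zero, and rolling forward one full period returns to the starting latent. Real per-frame pose latents would only approximately satisfy such a model, which is why a loss of this kind can act as a regularizer rather than a hard constraint.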
URL
https://arxiv.org/abs/2308.07468