Abstract
Motion capture using sparse inertial sensors has shown great promise due to its portability and lack of occlusion issues compared to camera-based tracking. Existing approaches typically assume that IMU sensors are tightly attached to the human body. However, this assumption often does not hold in real-world scenarios. In this paper, we present a new task of full-body human pose estimation using sparse, loosely attached IMU sensors. To solve this task, we simulate IMU recordings from an existing garment-aware human motion dataset. We develop transformer-based diffusion models to synthesize loose IMU data and to estimate human poses from this challenging data. In addition, we show that incorporating garment-related parameters while training the model on simulated loose data effectively maintains expressiveness and enhances the ability to capture variations introduced by looser or tighter garments. Experiments show that our proposed diffusion methods, trained on simulated and synthetic data, outperform state-of-the-art methods quantitatively and qualitatively, opening up a promising direction for future research.
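As a rough illustration of what simulating IMU recordings from motion data can involve, the sketch below derives accelerometer-like signals from a 3D point trajectory using a central second-order finite difference. This is a generic technique and an assumption on our part, not necessarily the simulation procedure used in the paper. The function name `synthesize_acceleration`, the 60 fps frame rate, and the free-fall trajectory are all hypothetical, and the sketch ignores sensor orientation, gravity compensation, and noise.

```python
def synthesize_acceleration(positions, fps):
    """Approximate global-frame acceleration of a tracked point.

    positions: list of (x, y, z) tuples in metres, one per frame.
    fps: frame rate in Hz (hypothetical value chosen by the caller).
    Returns one acceleration tuple per interior frame (len(positions) - 2).
    """
    if len(positions) < 3:
        raise ValueError("need at least three frames")
    dt2 = (1.0 / fps) ** 2
    accels = []
    # Central second difference: (p[t-1] - 2 p[t] + p[t+1]) / dt^2
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        accels.append(
            tuple((a - 2.0 * b + c) / dt2 for a, b, c in zip(prev, cur, nxt))
        )
    return accels


# Hypothetical trajectory: a point accelerating at -9.8 m/s^2 along y,
# standing in for a garment vertex where a loose sensor might sit.
fps = 60
trajectory = [(0.0, -0.5 * 9.8 * (i / fps) ** 2, 0.0) for i in range(5)]
accels = synthesize_acceleration(trajectory, fps)
```

In a garment-aware setting, the same differencing could in principle be applied to a vertex on the simulated clothing rather than on the body surface, which is one way a loosely attached sensor's signal would come to differ from a tightly attached one.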
URL
https://arxiv.org/abs/2506.15290