Abstract
In this paper, we address the challenge of generating realistic 3D human motions for action classes never seen during training. Our approach decomposes complex actions into simpler movements, specifically movements observed during training, by leveraging the knowledge of human motion contained in GPT models. These simpler movements are then combined into a single, realistic animation using the properties of diffusion models. Our claim is that this decomposition and subsequent recombination of simple movements can synthesize an animation that accurately represents the complex input action. The method operates at inference time and can be integrated with any pre-trained diffusion model, enabling the synthesis of motion classes absent from the training data. We evaluate our method by splitting two benchmark human motion datasets into basic and complex actions, and then compare its performance against the state of the art.
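To make the described inference-time pipeline concrete, here is a minimal Python sketch of the idea as stated in the abstract: a language model is prompted to decompose an unseen action label into simpler, known sub-actions, and a pretrained text-conditioned motion diffusion model's denoising predictions for those sub-actions are fused at each sampling step. All names here (`decompose_action`, `noise_pred`, `denoise_step`, the prompt wording, and the simple averaging of noise predictions) are illustrative assumptions, not the paper's actual interfaces or composition rule.

```python
# Illustrative sketch only: the function/class names and the averaging
# used to combine denoiser outputs are assumptions, not the paper's
# actual method.
import torch

def decompose_action(llm, complex_action: str) -> list[str]:
    """Ask an LLM to split an unseen action into simpler, known sub-actions."""
    prompt = (
        f"Decompose the human action '{complex_action}' into a short list "
        f"of basic body movements, one per line."
    )
    reply = llm.generate(prompt)  # hypothetical LLM interface
    return [line.strip() for line in reply.splitlines() if line.strip()]

@torch.no_grad()
def sample_unseen_action(model, llm, complex_action: str, steps: int = 50):
    """Sample a motion for an action class the diffusion model never saw.

    `model` is a pretrained text-conditioned motion diffusion model with
    an assumed API:
      - model.noise_pred(x_t, t, text) -> predicted noise
      - model.denoise_step(x_t, t, eps) -> x_{t-1}
    """
    sub_actions = decompose_action(llm, complex_action)
    x_t = torch.randn(1, model.seq_len, model.motion_dim)  # start from noise
    for t in reversed(range(steps)):
        # One simple way to fuse the sub-actions: average the noise
        # predictions conditioned on each simple movement.
        eps = torch.stack(
            [model.noise_pred(x_t, t, text=a) for a in sub_actions]
        ).mean(dim=0)
        x_t = model.denoise_step(x_t, t, eps)
    return x_t  # final motion sequence for the unseen action
```

The key design point the abstract emphasizes is that everything above happens at sampling time, so any pre-trained motion diffusion model can be dropped in for `model` without retraining.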
URL
https://arxiv.org/abs/2409.11920