Abstract
Motion style transfer is an important research direction in multimedia applications. It enables virtual digital humans to quickly switch among different styles of the same motion, greatly increasing the diversity and realism of movements, and it is widely applied in multimedia scenarios such as movies, games, and the Metaverse. However, most existing work in this field adopts GANs, which can suffer from instability and convergence issues, leaving the generated motion sequences somewhat chaotic and unable to reflect a highly realistic and natural style. To address these problems, we treat the style motion as a condition and propose the Style Motion Conditioned Diffusion (SMCD) framework for the first time, which can learn the style features of motion more comprehensively. Moreover, we apply the Mamba model to the motion style transfer field for the first time, introducing the Motion Style Mamba (MSM) module to handle longer motion sequences. Third, for the SMCD framework, we propose a Diffusion-based Content Consistency Loss and a Content Consistency Loss to assist the training of the overall framework. Finally, we conduct extensive experiments. The results show that our method surpasses state-of-the-art methods in both qualitative and quantitative comparisons and can generate more realistic motion sequences.
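To make the conditioning idea concrete, below is a minimal, hypothetical sketch of a style-conditioned diffusion training step in PyTorch: a toy denoiser receives a noised content motion clip, a diffusion timestep, and a style motion clip, and regresses the injected noise (a standard DDPM-style objective). All module names, tensor shapes, and architectural choices are illustrative assumptions and not the paper's actual SMCD or MSM implementation; in particular, the paper uses a Mamba-based backbone, whereas GRUs stand in here for brevity.

```python
# Hypothetical sketch of style-motion-conditioned diffusion training.
# Not from the SMCD codebase: names, shapes, and the DDPM-style
# noise-prediction objective are all illustrative assumptions.
import torch
import torch.nn as nn

class StyleConditionedDenoiser(nn.Module):
    """Toy denoiser: predicts the noise added to a content motion clip,
    conditioned on a style motion clip and the diffusion timestep."""
    def __init__(self, pose_dim: int = 63, hidden: int = 256):
        super().__init__()
        self.style_encoder = nn.GRU(pose_dim, hidden, batch_first=True)
        self.time_embed = nn.Embedding(1000, hidden)
        self.backbone = nn.GRU(pose_dim + hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, noisy_content, t, style_motion):
        # Summarize the style clip into one conditioning vector.
        _, style_h = self.style_encoder(style_motion)    # (1, B, H)
        cond = style_h[-1] + self.time_embed(t)          # (B, H)
        cond = cond.unsqueeze(1).expand(-1, noisy_content.size(1), -1)
        out, _ = self.backbone(torch.cat([noisy_content, cond], dim=-1))
        return self.head(out)                            # predicted noise

def training_step(model, content, style, alphas_cumprod):
    """One DDPM-style step: corrupt the content motion at a random
    timestep, then regress the noise given the style clip as condition."""
    B = content.size(0)
    t = torch.randint(0, alphas_cumprod.numel(), (B,))
    a = alphas_cumprod[t].view(B, 1, 1)
    noise = torch.randn_like(content)
    noisy = a.sqrt() * content + (1 - a).sqrt() * noise
    pred = model(noisy, t, style)
    return nn.functional.mse_loss(pred, noise)

model = StyleConditionedDenoiser()
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
content = torch.randn(4, 60, 63)   # (batch, frames, pose features)
style = torch.randn(4, 60, 63)     # style clip with the same layout
loss = training_step(model, content, style, alphas_cumprod)
loss.backward()
```

At sampling time, the same conditioning path would be reused at every denoising step, so the style clip steers the whole reverse trajectory rather than being applied as a post-hoc edit; this is the usual appeal of conditional diffusion over GAN-based transfer.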
URL
https://arxiv.org/abs/2405.02844