Abstract
Previous animatronic faces struggle to express emotions effectively due to hardware and software limitations. On the hardware side, earlier approaches either use rigid-driven mechanisms, which provide precise control but are difficult to design within constrained spaces, or tendon-driven mechanisms, which are more space-efficient but challenging to control. In contrast, we propose a hybrid actuation approach that combines the best of both worlds. The eyes and mouth, the key areas for emotional expression, are controlled by rigid mechanisms for precise movement, while the nose and cheeks, which convey subtle facial microexpressions, are driven by strings. This design lets us build a compact yet versatile hardware platform capable of expressing a wide range of emotions. On the algorithmic side, our method introduces a self-modeling network that maps motor actions to facial landmarks, allowing us to automatically establish, via gradient backpropagation, the relationship between the blendshape coefficients of different facial expressions and the corresponding motor control signals. We then train a neural network to map speech input to the corresponding blendshape controls. With our method, we can generate distinct emotional expressions such as happiness, fear, disgust, and anger from any given sentence, each with nuanced, emotion-specific control signals, a capability not demonstrated in earlier systems. We release the hardware design and code at this https URL and this https URL.
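The inversion step described above, backpropagating through a learned forward model to recover motor commands, can be illustrated with a minimal sketch. This is not the authors' code: a toy linear map stands in for the trained self-modeling network, and all names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the self-modeling idea: a differentiable forward
# model f(motor) -> facial landmarks is inverted by gradient descent to find
# the motor commands that realize a target expression.

rng = np.random.default_rng(0)
n_motors, n_landmarks = 4, 6

# Stand-in for the learned self-model weights (kept well-conditioned).
W = np.eye(n_landmarks, n_motors) + 0.1 * rng.normal(size=(n_landmarks, n_motors))

def forward(motor):
    """Toy self-model: motor actions -> predicted landmark positions."""
    return W @ motor

def solve_motors(target, steps=1000, lr=0.1):
    """Gradient descent on motor commands so forward(motor) matches the
    target landmarks, mirroring the paper's backpropagation from
    blendshape-derived landmarks to motor control signals."""
    motor = np.zeros(n_motors)
    for _ in range(steps):
        residual = forward(motor) - target   # landmark prediction error
        motor -= lr * (W.T @ residual)       # gradient of 0.5 * ||residual||^2
    return motor

# Target landmarks produced by a known motor setting, then recovered.
true_motor = rng.normal(size=n_motors)
target = forward(true_motor)
motor = solve_motors(target)
```

In the real system the forward model is a neural network rather than a linear map, but the same principle applies: because the self-model is differentiable, any landmark target derived from blendshape coefficients can be translated into motor signals by gradient descent, without hand-tuning each expression.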
URL
https://arxiv.org/abs/2507.16645