Communicating human intent to a robotic companion by multi-type gesture sentences

2023-03-08 09:02:12
Petr Vanc, Jan Kristof Behrens, Karla Stepanova, Vaclav Hlavac

Abstract

Human-robot collaboration in home and industrial workspaces is on the rise. However, the communication between robots and humans is a bottleneck. Although people use a combination of different types of gestures to complement speech, only a few robotic systems utilize gestures for communication. In this paper, we propose a gesture pseudo-language and show how multiple types of gestures can be combined to express human intent to a robot (i.e., expressing both the desired action and its parameters - e.g., pointing to an object and showing that the object should be emptied into a bowl). The demonstrated gestures and the perceived table-top scene (object poses detected by CosyPose) are processed in real time to extract the human's intent. We utilize behavior trees to generate reactive robot behavior that handles various possible states of the world (e.g., a drawer has to be opened before an object is placed into it) and recovers from errors (e.g., when the scene changes). Furthermore, our system enables switching between direct teleoperation of the end-effector and high-level operation using the proposed gesture sentences. The system is evaluated on increasingly complex tasks using a real 7-DoF Franka Emika Panda manipulator. Controlling the robot via action gestures lowered the execution time by up to 60%, compared to direct teleoperation.
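
The reactive behavior the abstract describes follows the standard behavior-tree pattern: condition and action leaves composed under sequence and fallback nodes, so a precondition such as "the drawer is open" is checked (and, if needed, established) every time the tree runs. Below is a minimal, self-contained Python sketch of that pattern for the drawer example; all names here (WorldState, open_drawer, and so on) are illustrative assumptions, not the paper's implementation.

```python
# Minimal behavior-tree sketch: a fallback node tries alternatives until one
# succeeds, a sequence node enforces ordering. "Open the drawer before
# placing the object into it" then emerges from the tree structure rather
# than from a fixed script. Names are illustrative, not the paper's API.

from dataclasses import dataclass


@dataclass
class WorldState:
    drawer_open: bool = False
    object_in_drawer: bool = False


def sequence(*children):
    """Succeed only if every child succeeds, evaluated in order."""
    def run(world):
        return all(child(world) for child in children)
    return run


def fallback(*children):
    """Succeed as soon as any child succeeds (try alternatives in order)."""
    def run(world):
        return any(child(world) for child in children)
    return run


def drawer_is_open(world):          # condition leaf: reads the world state
    return world.drawer_open


def open_drawer(world):             # action leaf: stands in for a robot skill
    world.drawer_open = True
    return True


def place_object_in_drawer(world):  # action leaf: stands in for a robot skill
    world.object_in_drawer = True
    return True


# "Place into drawer" runs only once the drawer is known (or made) open.
place_into_drawer = sequence(
    fallback(drawer_is_open, open_drawer),
    place_object_in_drawer,
)

world = WorldState()
assert place_into_drawer(world) and world.object_in_drawer
```

Because the fallback re-evaluates its condition leaf on every tick, the same tree also covers the recovery case the abstract mentions: if the scene changes (say, the drawer is closed mid-task), the next tick simply re-triggers open_drawer before placing the object.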

Abstract (translated)

Human-robot collaboration in home and industrial workspaces is increasing. However, communication between robots and humans remains a bottleneck. Although people use combinations of different gesture types to complement speech, only a few robotic systems use gestures for communication. In this paper, we propose a gesture pseudo-language and show how multiple types of gestures can be combined to express human intent to a robot (i.e., expressing both the desired action and its parameters, e.g., pointing at an object and indicating that it should be emptied into a bowl). The demonstrated gestures and the perceived table-top scene (object poses detected by CosyPose) are processed in real time to extract the human's intent. We use behavior trees to generate reactive robot behavior that handles various possible world states (e.g., a drawer must be opened before an object can be placed into it) and recovers from errors (e.g., when the scene changes). Furthermore, our system allows switching between direct teleoperation of the end-effector and high-level operation using the proposed gesture sentences. The system is evaluated on increasingly complex tasks with a real 7-DoF Franka Emika Panda manipulator. Controlling the robot via action gestures reduced execution time by up to 60% compared to direct teleoperation.

URL

https://arxiv.org/abs/2303.04451

PDF

https://arxiv.org/pdf/2303.04451.pdf

