Abstract
An object handover between a robot and a human is a coordinated action that is prone to failure for reasons such as miscommunication, incorrect actions, and unexpected object properties. Existing work on handover failure detection and prevention focuses on preventing failures caused by object slip or external disturbances. However, there is a lack of datasets and evaluation methods that consider unpreventable failures caused by the human participant. To address this gap, we present the multimodal Handover Failure Detection dataset, which consists of failures induced by the human participant, such as ignoring the robot or not releasing the object. We also present two baseline methods for handover failure detection: (i) a video classification method using 3D CNNs and (ii) a temporal action segmentation approach that jointly classifies the human action, the robot action, and the overall outcome of the handover. The results show that video is an important modality, but incorporating force-torque data and gripper position helps improve failure detection and action segmentation accuracy.
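The abstract does not specify the architecture of the 3D-CNN video classification baseline. As a rough illustration of the underlying idea only, not the paper's actual model, the following NumPy sketch applies a spatio-temporal (3D) convolution over a video clip, pools the responses, and maps them to outcome logits; all shapes, filter counts, and class labels here are hypothetical.

```python
import numpy as np

def conv3d(video, kernel):
    """Valid-mode 3D cross-correlation over a (T, H, W) grayscale clip."""
    t, h, w = video.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((t - kt + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                # Each output voxel responds to a small spatio-temporal volume.
                out[i, j, k] = np.sum(video[i:i+kt, j:j+kh, k:k+kw] * kernel)
    return out

def classify_clip(video, kernels, weights, bias):
    """Toy classifier: 3D conv features -> global average pool -> linear logits."""
    feats = np.array([conv3d(video, k).mean() for k in kernels])  # one scalar per filter
    return weights @ feats + bias  # logits over hypothetical outcome classes

rng = np.random.default_rng(0)
video = rng.random((8, 16, 16))      # 8 frames of 16x16 grayscale (hypothetical clip)
kernels = rng.random((4, 3, 3, 3))   # 4 spatio-temporal filters (hypothetical)
weights = rng.random((3, 4))         # 3 outcome classes, e.g. success vs. two failure modes
logits = classify_clip(video, kernels, weights, bias=rng.random(3))
print(logits.shape)  # (3,)
```

A trained model would learn the filters and classifier weights from labeled handover clips; here they are random placeholders to keep the sketch self-contained.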
URL
https://arxiv.org/abs/2402.18319