Extensible Grounding of Speech for Robot Instruction

Abstract

Spoken language is a convenient interface for commanding a mobile robot. For this to work, however, a number of base terms must be grounded in perceptual and motor skills. We detail the language processing used on our robot ELI and explain how this grounding is performed, how it interacts with user gestures, and how it handles phenomena such as anaphora. More importantly, there are certain concepts that the robot cannot be preprogrammed with, such as the names of various objects in a household or the nature of specific tasks it may be asked to perform. In these cases it is vital that there be a method for extending the grounding, essentially "learning by being told". We describe how this was successfully implemented for learning new nouns and verbs in a tabletop setting. Creating this language learning kernel may be the last explicit programming the robot ever needs: the core mechanism could eventually be used for imparting a vast amount of knowledge, much as a child learns from its parents and teachers.

URL

https://arxiv.org/abs/1807.11838

PDF

https://arxiv.org/pdf/1807.11838.pdf