Abstract
Learning to effectively imitate human teleoperators, with generalization to unseen and dynamic environments, is a promising path to greater autonomy, enabling robots to steadily acquire complex skills from supervision. We propose a new motion learning technique rooted in contraction theory and sum-of-squares programming for estimating a control law in the form of a polynomial vector field from a given set of demonstrations. Notably, this vector field is provably optimal for the problem of minimizing imitation loss while providing continuous-time guarantees on the induced imitation behavior. Our method generalizes to new initial and goal poses of the robot and can adapt in real time to dynamic obstacles during execution, with convergence to teleoperator behavior within a well-defined safety tube. We present an application of our framework to pick-and-place tasks in the presence of moving obstacles on a 7-DOF KUKA IIWA arm. The method compares favorably to other learning-from-demonstration approaches on benchmark handwriting imitation tasks.
URL
https://arxiv.org/abs/1905.09499