Abstract
Learning from Demonstration allows robots to mimic human actions. However, these methods do not model the constraints that are crucial to ensuring the safety of the learned skill. Moreover, even when they explicitly model constraints, they rely on the assumption of a known cost function, which limits their practical usability for tasks with unknown cost. In this work, we propose a two-step optimization process that estimates both cost and constraints by decoupling the learning of cost functions from the identification of unknown constraints within the demonstrated trajectories. First, we identify the cost function by isolating the effect of constraints on parts of the demonstrations. Subsequently, a constraint learning method is used to identify the unknown constraints. Our approach is validated on both simulated trajectories and a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
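The two-step idea can be illustrated on a deliberately tiny problem. The sketch below is not the paper's algorithm: it assumes a hypothetical 1-D demonstrator that minimizes (x - theta * s)^2 subject to an unknown upper bound x <= bound, and it uses a median-ratio estimator as a stand-in for "isolating the effect of constraints". All function names, parameters, and values are assumptions made for illustration only.

```python
import numpy as np

def demonstrate(s, theta, bound):
    # Hypothetical demonstrator: minimizes (x - theta * s)^2 subject to x <= bound.
    return np.minimum(theta * s, bound)

def estimate_cost(s, x):
    # Step 1 (assumed stand-in): robust estimate of the cost parameter theta.
    # Assumes s > 0 and that fewer than half of the demonstrations are affected
    # by the constraint, so the median ratio comes from an unconstrained one.
    return float(np.median(x / s))

def estimate_constraint(s, x, theta_hat, tol=1e-6):
    # Step 2: demonstrations that deviate from the unconstrained optimum
    # theta_hat * s are treated as constraint-active and reveal the bound.
    active = np.abs(x - theta_hat * s) > tol
    if not active.any():
        return None  # no evidence of a constraint
    return float(x[active].mean())

# Synthetic demonstrations (illustrative values, not data from the paper).
s = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
x = demonstrate(s, theta=2.0, bound=5.5)

# Decoupled two-step estimate.
theta_hat = estimate_cost(s, x)
bound_hat = estimate_constraint(s, x, theta_hat)

# Naive baseline: least squares over all demonstrations, ignoring the constraint.
theta_naive = float(np.dot(s, x) / np.dot(s, s))
bound_naive = estimate_constraint(s, x, theta_naive)

print("two-step:", theta_hat, bound_hat)
print("naive:   ", theta_naive, bound_naive)
```

In this toy setting the naive fit is biased by the constrained demonstrations, so every demonstration appears to deviate from the predicted optimum and the inferred bound is wrong. This mirrors, in a much simpler form, the abstract's point that an incorrect cost estimate corrupts the learned constraints.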
URL
https://arxiv.org/abs/2405.03491