Abstract
Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some 'movable' objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions, where multiple objects might be moved by the robot simultaneously, and objects might tilt, lean on each other, or topple. To support this, we query a physics-based simulator to forward simulate these interaction dynamics, which makes action evaluation during planning computationally very expensive. To make the planner tractable, we establish a connection between the domain of Manipulation Among Movable Objects and Multi-Agent Pathfinding that lets us decompose the problem into two phases that our M4M algorithm iterates over. First, we solve a multi-agent planning problem that reasons about the configurations of movable objects but does not forward simulate a physics model. Next, we solve an arm motion planning problem that uses a physics-based simulator but does not search over possible configurations of movable objects. We run simulated and real-world experiments with the PR2 robot and compare against relevant baseline algorithms. Our results highlight that M4M generates complex 3D interactions, and solves at least twice as many problems as the baselines with competitive performance.
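The two-phase decomposition described in the abstract can be illustrated with a deliberately tiny sketch. This is not the M4M implementation: every name here (`propose_goals`, `two_phase_plan`, `toy_simulate_push`) is hypothetical, the shelf is reduced to a 1-D row of cells, a greedy nearest-free-cell assignment stands in for a real Multi-Agent Pathfinding solver, and a simple distance rule stands in for the physics-based simulator. It only shows the control flow: a cheap phase proposes where objects should go without simulating, an expensive phase checks each proposed move, and a rejected move is fed back as a constraint before replanning.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Objects = Dict[str, int]  # toy 1-D shelf: object name -> occupied cell


def propose_goals(objects: Objects, must_clear: Set[int], n_cells: int,
                  banned: Set[Tuple[str, int]]) -> Optional[Objects]:
    """Phase 1 (no simulation): choose a goal cell for each blocking object.

    Greedy stand-in for a multi-agent pathfinding solver (an assumption).
    """
    occupied = set(objects.values())
    goals: Objects = {}
    for name, cell in sorted(objects.items()):
        if cell not in must_clear:
            continue  # this object is not in the way
        candidates = [c for c in range(n_cells)
                      if c not in must_clear
                      and c not in occupied
                      and (name, c) not in banned]
        if not candidates:
            return None  # no admissible goal cell left for this object
        goal = min(candidates, key=lambda c: abs(c - cell))
        goals[name] = goal
        occupied.add(goal)
    return goals


def toy_simulate_push(objects: Objects, name: str, goal: int) -> bool:
    """Phase 2 stand-in for a physics rollout: short pushes succeed."""
    return abs(goal - objects[name]) <= 2


def two_phase_plan(objects: Objects, must_clear: Set[int], n_cells: int,
                   simulate_push: Callable[[Objects, str, int], bool],
                   max_iters: int = 20) -> Optional[Objects]:
    """Alternate between proposing goals and validating them by simulation."""
    objects = dict(objects)
    banned: Set[Tuple[str, int]] = set()
    for _ in range(max_iters):
        goals = propose_goals(objects, must_clear, n_cells, banned)
        if goals is None:
            return None  # phase 1 found no assignment
        if not goals:
            return objects  # region is clear: the target can be retrieved
        for name, goal in goals.items():
            if simulate_push(objects, name, goal):
                objects[name] = goal  # commit the validated rearrangement
            else:
                banned.add((name, goal))  # feed the failure back to phase 1
                break  # replan with the new constraint
    return None


if __name__ == "__main__":
    print(two_phase_plan({"a": 2, "b": 3}, {2, 3}, 6, toy_simulate_push))
```

The design point the sketch tries to capture is that the expensive check runs only on moves the cheap phase has already selected, rather than on every candidate object configuration.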
URL
https://arxiv.org/abs/2303.13352