Abstract
Reinforcement learning (RL) is a promising tool for learning complex policies, even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks, ranging from easy ones like reaching to challenging ones like pushing a block by hitting it with a puck, as well as goal-based and human-interactive tasks, our testbed allows a varied assessment of RL capabilities. The testbed also supports sim-to-real transfer across three domains: two simulators of increasing fidelity and a real robot system. Using demonstration data gathered through two teleoperation systems, a virtualized control environment and human shadowing, we assess the testbed with behavior cloning, offline RL, and RL from scratch.
URL
https://arxiv.org/abs/2405.03113