Abstract
In this paper, we address the reality gap from a novel perspective, with the aim of transferring Deep Reinforcement Learning (DRL) policies learned in simulated environments to the real-world domain for visual control tasks. Instead of adopting the common solution of increasing the visual fidelity of the synthetic images output by simulators during the training phase, we tackle the problem by translating the real-world image streams back to the synthetic domain during the deployment phase, to make the robot feel at home. We propose this as a lightweight, flexible, and efficient solution for visual control, as 1) no extra transfer steps are required during the expensive training of DRL agents in simulation; 2) the trained DRL agents are not constrained to being deployable in only one specific real-world environment; 3) the policy training and the transfer operations are decoupled, and can be conducted in parallel. In addition, we propose a simple yet effective shift loss that enforces consistency between subsequent frames, which is important for consistent policy outputs. We validate the shift loss on artistic style transfer for videos and on domain adaptation, and validate our visual control approach in both indoor and outdoor robotics experiments. A video of our results is available at: https://goo.gl/P76TTo.
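The abstract does not spell out the form of the shift loss, so the following is only a minimal sketch under an assumed reading: an image translator is penalized when translating a shifted view of a frame gives a different result than shifting the translated frame. The shift is approximated here by cropping, and the names `crop`, `shift_consistency`, `pointwise`, and `position_dependent` are hypothetical, chosen for illustration rather than taken from the paper.

```python
def crop(image, dx, dy):
    """Drop the first dy rows and dx columns: the view shifted by (dx, dy)."""
    return [row[dx:] for row in image[dy:]]


def shift_consistency(translate, image, dx, dy):
    """Mean squared gap between translate(shifted image) and shifted translate(image).

    Assumed form of a shift-consistency penalty, not the paper's exact definition.
    `translate` must map a 2D list of floats to a 2D list of the same size.
    """
    a = translate(crop(image, dx, dy))
    b = crop(translate(image), dx, dy)
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def pointwise(image):
    """Toy translator that treats every pixel identically (shift-consistent)."""
    return [[2.0 * p + 1.0 for p in row] for row in image]


def position_dependent(image):
    """Toy translator whose output depends on the column index (not shift-consistent)."""
    return [[p + 0.1 * x for x, p in enumerate(row)] for row in image]


# A small 4x5 synthetic "frame" used only for illustration.
image = [[float(x + y * 5) for x in range(5)] for y in range(4)]

loss_pointwise = shift_consistency(pointwise, image, 1, 1)
loss_positional = shift_consistency(position_dependent, image, 1, 0)
```

In this toy setting the pointwise translator incurs no penalty, while the position-dependent one does, which is the kind of frame-to-frame inconsistency such a loss is meant to discourage. In an actual training loop the offsets would presumably be sampled at random and the penalty added to the translator's other objectives; that detail is likewise an assumption.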
URL
https://arxiv.org/abs/1802.00265