Abstract
We investigate the possibility of using animal videos to improve Reinforcement Learning (RL) efficiency and performance. From a theoretical perspective, we motivate the use of weighted policy optimization for off-policy RL, describe the main challenges of learning from videos, and propose solutions. We test our ideas in both offline and online RL and show encouraging results on a series of 2D navigation tasks.
URL
https://arxiv.org/abs/2209.12347