Abstract
Deep reinforcement learning (DRL) is playing an increasingly important role in real-world applications. However, obtaining an optimally performing DRL agent for complex tasks, especially those with sparse rewards, remains a significant challenge. The training of a DRL agent can often become trapped in a bottleneck without further progress. In this paper, we propose RICE, an innovative refining scheme for reinforcement learning that incorporates explanation methods to break through training bottlenecks. The high-level idea of RICE is to construct a new initial state distribution that combines both the default initial states and critical states identified through explanation methods, thereby encouraging the agent to explore from the mixed initial states. Through careful design, we can theoretically guarantee that our refining scheme has a tighter sub-optimality bound. We evaluate RICE in various popular RL environments and real-world applications. The results demonstrate that RICE significantly outperforms existing refining schemes in enhancing agent performance.
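To make the mixed initial-state idea concrete, below is a minimal Python sketch of how such a reset distribution could be sampled. The names (`select_initial_state`, `p_critical`, `critical_states`) and the mixing scheme shown here are illustrative assumptions, not the paper's actual implementation; the paper only specifies that default initial states are combined with explanation-identified critical states.

```python
import random

# Minimal sketch of a mixed initial-state distribution (illustrative, not
# the paper's code). With probability `p_critical` an episode restarts from
# a critical state flagged by an explanation method; otherwise it restarts
# from the environment's default initial-state distribution.

def select_initial_state(default_states, critical_states, p_critical=0.5):
    """Sample a starting state from the mixed distribution."""
    if critical_states and random.random() < p_critical:
        # Restart near a state the explanation method identified as critical,
        # encouraging exploration around the training bottleneck.
        return random.choice(critical_states)
    # Fall back to the default reset distribution.
    return random.choice(default_states)

# Hypothetical usage: in practice, `critical_states` would come from an
# explanation method that scores which visited states most influenced
# the agent's return.
default_states = [(0, 0)]            # e.g., a grid-world origin
critical_states = [(3, 4), (7, 2)]   # states flagged as critical
for episode in range(5):
    s0 = select_initial_state(default_states, critical_states)
    print(f"episode {episode} starts at {s0}")
```

Mixing rather than fully replacing the reset distribution preserves coverage of the original task while still concentrating experience around the identified bottleneck states.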
URL
https://arxiv.org/abs/2405.03064