Abstract
Humans rely on high-level meta-representations to engage in abstract reasoning. In complex cognitive tasks, these meta-representations help individuals abstract general rules from experience. However, constructing such meta-representations from high-dimensional observations remains a longstanding challenge for reinforcement learning agents. For instance, a well-trained agent often fails to generalize to even minor variations of the same task, such as a change in background color, which humans handle with ease. In this paper, we build a bridge between meta-representation and generalization, showing that generalization performance benefits from meta-representation learning. We also hypothesize that deep mutual learning (DML) among agents can help them converge to meta-representations. Empirical results support our theory and hypothesis. Overall, this work provides a new perspective on the generalization of deep reinforcement learning.
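The abstract names DML only at a high level. As a point of reference, below is a minimal sketch of the mutual-learning objective as commonly formulated (Zhang et al., 2018, "Deep Mutual Learning"), adapted to two peer policy networks: each agent adds a KL term pulling its action distribution toward its peer's. The PolicyNet architecture, the beta weight, and the task_loss placeholders are illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical peer policy network; the paper's architecture may differ.
class PolicyNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # action logits


def mutual_learning_loss(logits_a, logits_b, task_loss_a, task_loss_b, beta=1.0):
    """Symmetric DML objective: each agent's own task loss plus a KL
    mimicry term toward its (detached) peer.

    task_loss_* is a placeholder for whatever RL objective each agent
    optimizes (e.g. a policy-gradient loss); it is not specified here.
    """
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    # KL(p_b || p_a): pull agent A's policy toward agent B's, and vice versa.
    kl_a = F.kl_div(log_p_a, log_p_b.detach(), log_target=True, reduction="batchmean")
    kl_b = F.kl_div(log_p_b, log_p_a.detach(), log_target=True, reduction="batchmean")
    return task_loss_a + beta * kl_a, task_loss_b + beta * kl_b


# Example: two peers observing the same batch of states.
obs = torch.randn(32, 8)
agent_a, agent_b = PolicyNet(8, 4), PolicyNet(8, 4)
loss_a, loss_b = mutual_learning_loss(
    agent_a(obs), agent_b(obs),
    task_loss_a=torch.tensor(0.0),  # placeholder for agent A's RL loss
    task_loss_b=torch.tensor(0.0),  # placeholder for agent B's RL loss
)
```

The detach on each peer's distribution means each agent treats the other as a fixed target within a step, which is what distinguishes mutual learning between co-trained peers from one-way distillation from a frozen teacher.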
URL
https://arxiv.org/abs/2501.02481