Abstract
In this work, we introduce dual goal representations for goal-conditioned reinforcement learning (GCRL). A dual goal representation characterizes a state by "the set of temporal distances from all other states"; in other words, it encodes a state through its relations to every other state, measured by temporal distance. This representation provides several appealing theoretical properties. First, it depends only on the intrinsic dynamics of the environment and is invariant to the original state representation. Second, it contains provably sufficient information to recover an optimal goal-reaching policy, while being able to filter out exogenous noise. Based on this concept, we develop a practical goal representation learning method that can be combined with any existing GCRL algorithm. Through diverse experiments on the OGBench task suite, we empirically show that dual goal representations consistently improve offline goal-reaching performance across 20 state- and pixel-based tasks.
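The core idea admits a compact sketch. Below is a minimal, hypothetical illustration (the function and variable names are ours, not from the paper's code): a goal g is encoded as the vector of temporal distances from a set of anchor states, which approximates "the set of temporal distances from all other states" when the anchors cover the state space.

```python
import numpy as np

def dual_goal_representation(goal, anchor_states, temporal_distance):
    """Encode a goal as the vector of temporal distances from each
    anchor state to the goal. With anchors covering the full state
    space, this matches the dual representation described above.
    `temporal_distance(s, g)` is assumed to return the (learned or
    exact) minimum number of steps needed to reach g from s."""
    return np.array([temporal_distance(s, goal) for s in anchor_states])

# Toy example: a 1-D chain MDP where the agent moves one cell per step,
# so the temporal distance is simply |s - g|.
anchors = [0, 1, 2, 3, 4]
phi = lambda g: dual_goal_representation(g, anchors, lambda s, t: abs(s - t))
print(phi(3))  # -> [3 2 1 0 1]
```

Because the encoding is built entirely from temporal distances induced by the dynamics, any re-embedding of the raw observations that preserves the dynamics leaves it unchanged, which is the representation-invariance property claimed above.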
URL
https://arxiv.org/abs/2510.06714