Abstract
Text-based reinforcement learning (TBRL) involves an agent interacting with a fictional environment, using the observed text and admissible natural-language actions to complete a task. Prior work has shown that agents can succeed in text-based interactive environments even in the complete absence of semantic understanding or other linguistic capabilities, which suggests that semantic understanding may not be important for the task. This raises an important question about the benefits of language models (LMs) in guiding agents through game states. In this work, we show that rich semantic understanding leads to efficient training of text-based RL agents. Moreover, we describe the occurrence of semantic degeneration as a consequence of inappropriate fine-tuning of language models in TBRL. Specifically, we describe the shift in the semantic representation of words in the LM, as well as how it affects the agent's performance on tasks that are semantically similar to the training games. We believe these results may help in developing better strategies for fine-tuning agents in text-based RL scenarios.
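To make the notion of a "shift in the semantic representation of words" concrete, the sketch below shows one simple way such drift could be quantified: compare each word's input embedding in a pre-trained LM against the same word's embedding in a checkpoint fine-tuned on a text-based game, using cosine distance. This is an illustrative sketch, not the authors' measurement; the model name, checkpoint path, and word list are placeholders.

    # Minimal sketch: embedding drift between a base LM and a fine-tuned copy.
    # Assumes Hugging Face transformers-style checkpoints; names are hypothetical.
    import torch
    from transformers import AutoModel, AutoTokenizer

    def embedding_drift(base_name, tuned_path, words):
        tok = AutoTokenizer.from_pretrained(base_name)
        base = AutoModel.from_pretrained(base_name).get_input_embeddings().weight
        tuned = AutoModel.from_pretrained(tuned_path).get_input_embeddings().weight

        drift = {}
        for w in words:
            ids = tok(w, add_special_tokens=False)["input_ids"]
            # Average over sub-word pieces so multi-token words are handled too.
            b = base[ids].mean(dim=0)
            t = tuned[ids].mean(dim=0)
            # 1 - cosine similarity: 0 means unchanged, larger means more drift.
            drift[w] = 1.0 - torch.nn.functional.cosine_similarity(b, t, dim=0).item()
        return drift

    # Example (hypothetical fine-tuned checkpoint path):
    # print(embedding_drift("distilbert-base-uncased", "./tbrl-finetuned",
    #                       ["key", "door", "open"]))

Larger drift values for game-relevant words after fine-tuning would be one concrete signature of the semantic degeneration the abstract refers to.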
URL
https://arxiv.org/abs/2404.10174