Abstract
Recently, large language models (LLMs) have demonstrated remarkable potential as intelligent agents. However, existing research mainly focuses on enhancing the agent's reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the procedure of exploration and exploitation. When addressing complex tasks within open-world interactive environments, these methods exhibit two limitations. First, the lack of global information about the environment leads to greedy decisions, resulting in sub-optimal solutions. Second, irrelevant information acquired from the environment not only introduces noise but also incurs additional cost. This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks. Concretely, WESE decouples the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration tasks and gather global knowledge. A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, improving the stronger agent's success rate and efficiency on the exploitation task. Our approach is flexible enough to accommodate diverse tasks, and obtains significant improvements in both success rate and efficiency across four interactive benchmarks.
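The decoupling described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the `KnowledgeGraph` class, the `explore` and `build_prompt` helpers, the triple format, and the substring-based entity matching are all hypothetical choices made for illustration. The weak agent is replaced by a plain list of observations, and the strong agent is represented only by the prompt it would receive.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Hypothetical store for (head, relation, tail) triples found during exploration."""

    def __init__(self):
        # Index every triple under both of its entities for fast lookup.
        self._by_entity = defaultdict(set)

    def add(self, head, relation, tail):
        triple = (head, relation, tail)
        self._by_entity[head].add(triple)
        self._by_entity[tail].add(triple)

    def relevant(self, task, hops=1):
        """Return triples within `hops` steps of any entity named in the task text."""
        text = task.lower()
        # Assumption: an entity is task-relevant if its name appears in the task.
        frontier = {e for e in self._by_entity if e.lower() in text}
        seen, found = set(), set()
        for _ in range(hops):
            next_frontier = set()
            for entity in frontier - seen:
                seen.add(entity)
                for triple in self._by_entity[entity]:
                    found.add(triple)
                    next_frontier.update((triple[0], triple[2]))
            frontier = next_frontier
        return sorted(found)


def explore(observations, kg):
    """Stand-in for the weak agent: record each observed triple in the graph."""
    for head, relation, tail in observations:
        kg.add(head, relation, tail)


def build_prompt(task, kg, hops=1):
    """Assemble the context a stronger agent would receive for exploitation."""
    facts = "\n".join(f"{h} {r} {t}" for h, r, t in kg.relevant(task, hops))
    return f"Known facts:\n{facts}\n\nTask: {task}"


if __name__ == "__main__":
    kg = KnowledgeGraph()
    explore(
        [
            ("apple", "in", "fridge"),
            ("fridge", "in", "kitchen"),
            ("sofa", "in", "living room"),
        ],
        kg,
    )
    print(build_prompt("put the apple on the table", kg, hops=2))
```

In this toy setup the unrelated `sofa` triple never reaches the prompt, which mirrors the abstract's point that filtering out irrelevant knowledge reduces both noise and cost.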
Abstract (translated)
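In recent years, large language models (LLMs) have shown remarkable potential as intelligent agents. However, existing research mainly focuses on strengthening an agent's reasoning or decision-making abilities through carefully designed prompt engineering or task-specific fine-tuning, while overlooking the process of exploration and exploitation. These methods show limitations when handling complex tasks in open-world interactive environments. First, the lack of global information about the environment leads to greedy decisions and therefore to sub-optimal solutions. Second, irrelevant information obtained from the environment not only introduces noise but also incurs additional cost. This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to improve the performance of LLM agents on open-world interactive tasks. Specifically, WESE decouples the exploration and exploitation processes and uses a cost-effective weak agent to carry out exploration for global knowledge. A knowledge graph is then introduced to store the acquired knowledge and extract the task-relevant part, which improves the success rate and efficiency of the stronger agent on the exploitation task. The approach is flexible enough to cover a variety of tasks and achieves significant improvements on four interactive benchmarks.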
URL
https://arxiv.org/abs/2404.07456