Abstract
Recent advances in Large Language Models (LLMs) have made agent communication and social interaction more effective. Even so, building LLM-based agents that reason in dynamic environments involving competition and collaboration remains challenging because of the limitations of informed graph-based search methods. We propose PLAYER*, a novel framework based on an anytime sampling-based planner, which utilises sensors and pruners to enable a purely question-driven search framework for complex reasoning tasks. We also introduce a quantifiable evaluation method based on multiple-choice questions and construct the WellPlay dataset, which contains 1,482 QA pairs. Experiments show that PLAYER* improves efficiency and performance over existing methods in complex, dynamic environments.
URL
https://arxiv.org/abs/2404.17662