Abstract
LLM-based multi-agent systems (MAS) extend the capabilities of single LLMs by enabling cooperation among multiple specialized agents. However, most existing MAS frameworks rely on a single LLM to drive all agents, which caps the system's intelligence at the limit of that one model. This paper explores the paradigm of heterogeneous LLM-driven MAS (X-MAS), in which agents are powered by different LLMs, raising the system's potential to the collective intelligence of those models. We introduce X-MAS-Bench, a comprehensive testbed designed to evaluate the performance of various LLMs across different domains and MAS-related functions. In an extensive empirical study, we assess 27 LLMs across 5 domains (encompassing 21 test sets) and 5 functions, conducting over 1.7 million evaluations to identify the optimal model selection for each domain-function combination. Building on these findings, we demonstrate that transitioning from a homogeneous to a heterogeneous LLM-driven MAS can significantly enhance system performance without requiring structural redesign. Specifically, in a chatbot-only MAS scenario, the heterogeneous configuration yields up to an 8.4% performance improvement on the MATH dataset. In a mixed chatbot-reasoner scenario, the heterogeneous MAS achieves a 47% performance improvement on the AIME dataset. Our results underscore the potential of heterogeneous LLMs in MAS and point to a promising avenue for advancing scalable, collaborative AI systems.
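The selection step described in the abstract, choosing the best model for each domain-function combination, can be illustrated with a minimal sketch. It assumes benchmark results arrive as (model, domain, function, accuracy) records; the function name `best_model_per_cell`, the model names, the domain and function labels, and the scores are all hypothetical placeholders, not values or code from the paper.

```python
def best_model_per_cell(scores):
    """Return the highest-scoring model for each (domain, function) pair.

    `scores` is an iterable of (model, domain, function, accuracy) tuples.
    Ties keep the model that appears first in the input.
    """
    best = {}
    for model, domain, function, accuracy in scores:
        key = (domain, function)
        # Replace the current pick only if this model scores strictly higher.
        if key not in best or accuracy > best[key][1]:
            best[key] = (model, accuracy)
    return {key: model for key, (model, _) in best.items()}


# Hypothetical benchmark records (illustrative numbers only).
scores = [
    ("model-a", "math", "qa", 0.71),
    ("model-b", "math", "qa", 0.78),
    ("model-a", "math", "revise", 0.66),
    ("model-b", "math", "revise", 0.61),
]

selection = best_model_per_cell(scores)
for (domain, function), model in sorted(selection.items()):
    print(f"{domain}/{function}: {model}")
```

Under this sketch, a heterogeneous MAS would assign each agent the model selected for its (domain, function) cell, whereas a homogeneous MAS would use one model for every cell.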
URL
https://arxiv.org/abs/2505.16997