Abstract
LLM-based multi-agent systems (MAS) have demonstrated significant potential in enhancing single LLMs to address complex and diverse tasks in practical applications. Despite considerable advancements, the field lacks a unified codebase that consolidates existing methods, resulting in redundant re-implementation efforts, unfair comparisons, and high entry barriers for researchers. To address these challenges, we introduce MASLab, a unified, comprehensive, and research-friendly codebase for LLM-based MAS. (1) MASLab integrates over 20 established methods across multiple domains, each rigorously validated by comparing step-by-step outputs with its official implementation. (2) MASLab provides a unified environment with various benchmarks for fair comparisons among methods, ensuring consistent inputs and standardized evaluation protocols. (3) MASLab implements methods within a shared streamlined structure, lowering the barriers to understanding and extension. Building on MASLab, we conduct extensive experiments covering 10+ benchmarks and 8 models, offering researchers a clear and comprehensive view of the current landscape of MAS methods. MASLab will continue to evolve, tracking the latest developments in the field, and invites contributions from the broader open-source community.
URL
https://arxiv.org/abs/2505.16988