AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need

2025-06-18 13:24:04
Zhouhong Gu, Xiaoxuan Zhu, Yin Cai, Hao Shen, Xingzhou Chen, Qingyi Wang, Jialin Li, Xiaoran Shi, Haoran Guo, Wenxuan Huang, Hongwei Feng, Yanghua Xiao, Zheyu Ye, Yao Hu, Shaosheng Cao

Abstract

Large language model (LLM)-based multi-agent systems have demonstrated significant potential in social simulation and complex task resolution. However, current frameworks face critical challenges in system architecture design, cross-domain generalizability, and performance guarantees, particularly as task complexity and the number of agents increase. We introduce AgentGroupChat-V2, a novel framework that addresses these challenges through three core innovations: (1) a divide-and-conquer, fully parallel architecture that decomposes user queries into hierarchical task forest structures, enabling dependency management and distributed concurrent processing; (2) an adaptive collaboration engine that dynamically selects heterogeneous LLM combinations and interaction modes based on task characteristics; and (3) agent organization optimization strategies that combine divide-and-conquer approaches for efficient problem decomposition. Extensive experiments demonstrate AgentGroupChat-V2's superior performance across diverse domains: 91.50% accuracy on GSM8K (exceeding the best baseline by 5.6 percentage points), 30.4% accuracy on competition-level AIME (nearly double that of other methods), and 79.20% pass@1 on HumanEval. The performance advantage grows with task difficulty, particularly on Level 5 MATH problems, where improvements exceed 11 percentage points over state-of-the-art baselines. These results confirm that AgentGroupChat-V2 provides a comprehensive solution for building efficient, general-purpose LLM multi-agent systems, with significant advantages in complex reasoning scenarios. Code is available at this https URL.
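The abstract's first innovation (decomposing a query into tasks with dependencies, then processing independent tasks concurrently) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Task` class, the `run_forest` function, and the toy arithmetic query are all hypothetical names invented for illustration, and plain Python functions stand in for LLM agent calls. The scheduler is also simplified to run in waves, starting every task whose dependencies are complete and waiting for that wave before starting the next.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """One node in a task tree (hypothetical structure, not from the paper)."""
    name: str
    # Receives the results of this task's dependencies, keyed by task name.
    solve: Callable[[Dict[str, str]], str]
    deps: List[str] = field(default_factory=list)


def run_forest(tasks: List[Task], max_workers: int = 4) -> Dict[str, str]:
    """Run tasks in dependency order; tasks that are ready together run concurrently."""
    pending = {t.name: t for t in tasks}
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            # A task is ready once every dependency has a result.
            ready = [t for t in pending.values()
                     if all(d in results for d in t.deps)]
            if not ready:
                raise ValueError("cyclic or missing dependency")
            # Submit the whole wave, then collect its results.
            futures = {
                t.name: pool.submit(t.solve, {d: results[d] for d in t.deps})
                for t in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del pending[name]
    return results


if __name__ == "__main__":
    # Toy query: "How many fruits are there in total?"
    demo = [
        Task("parse", lambda r: "3 apples, 4 pears"),
        Task("count_apples", lambda r: "3", deps=["parse"]),
        Task("count_pears", lambda r: "4", deps=["parse"]),
        Task("total",
             lambda r: str(int(r["count_apples"]) + int(r["count_pears"])),
             deps=["count_apples", "count_pears"]),
    ]
    print(run_forest(demo)["total"])
```

In this sketch the two counting subtasks depend only on `parse`, so they are submitted to the thread pool in the same wave, while `total` waits for both. A system like the one the abstract describes would presumably replace each `solve` function with a group of collaborating agents, but that detail is an assumption here.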

URL

https://arxiv.org/abs/2506.15451

PDF

https://arxiv.org/pdf/2506.15451.pdf

