Abstract
Large language models (LLMs) are increasingly used to support creative tasks such as research idea generation. While recent work has shown that structured dialogues between LLMs can improve the novelty and feasibility of generated ideas, the optimal design of such interactions remains unclear. In this study, we conduct a comprehensive analysis of multi-agent LLM dialogues for scientific ideation. We compare configurations that vary agent roles, the number of agents, and dialogue depth to understand how these factors influence the novelty and feasibility of generated ideas. Our experimental setup includes settings in which one agent generates ideas and another critiques them, enabling iterative improvement. Our results show that enlarging the agent cohort, deepening the interaction, and broadening the heterogeneity of agent personas each enrich the diversity of generated ideas. Moreover, increasing critic-side diversity within the ideation-critique-revision loop further boosts the feasibility of the final proposals. Our findings offer practical guidelines for building effective multi-agent LLM systems for scientific ideation. Our code is publicly available.
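The ideation-critique-revision loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Agent` type, the function `ideate_critique_revise`, the prompt wording, and the stub agents are all hypothetical, chosen only to show how the number of critics, their personas, and the dialogue depth could appear as parameters. The stubs stand in for real LLM calls so the sketch runs offline.

```python
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a reply string
# (in practice, a wrapper around an LLM call).
Agent = Callable[[str], str]


def ideate_critique_revise(
    generator: Agent,
    critics: List[Agent],
    topic: str,
    depth: int,
) -> List[str]:
    """Run `depth` critique-revision rounds and return every draft produced."""
    idea = generator(f"Propose a research idea on: {topic}")
    drafts = [idea]
    for _ in range(depth):
        # Each critic (possibly with a different persona) reviews the current draft.
        critiques = [critic(f"Critique this idea: {idea}") for critic in critics]
        feedback = "\n".join(critiques)
        # The generator revises its draft in light of all critiques.
        idea = generator(f"Revise the idea.\nIdea: {idea}\nCritiques:\n{feedback}")
        drafts.append(idea)
    return drafts


# Stub agents (assumptions for illustration only, not real LLM behavior).
def stub_generator(prompt: str) -> str:
    return f"idea<{len(prompt)}>"


def make_stub_critic(persona: str) -> Agent:
    return lambda prompt: f"[{persona}] needs a clearer evaluation plan"


drafts = ideate_critique_revise(
    stub_generator,
    [make_stub_critic("physicist"), make_stub_critic("statistician")],
    topic="multi-agent ideation",
    depth=2,
)
print(len(drafts))  # one initial draft plus one revision per round
```

In this sketch, critic-side diversity corresponds to the personas passed in the `critics` list, and dialogue depth corresponds to `depth`; how the study actually parameterizes these factors is not specified in the abstract.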
URL
https://arxiv.org/abs/2507.08350