Abstract
Can Large Language Models substitute for humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions when provided with the preceding story in high-quality novels. Leveraging character analyses written by literary experts, we construct the LIFECHOICE dataset, comprising 1,401 character decision points from 395 books. We then conduct comprehensive experiments on LIFECHOICE with various LLMs and LLM role-playing methods. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities on this task, yet substantial room for improvement remains. Hence, we further propose the CHARMAP method, which achieves a 6.01% increase in accuracy via persona-based memory retrieval. We will make our datasets and code publicly available.
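The abstract does not spell out how persona-based memory retrieval works in CHARMAP; the sketch below is a hypothetical illustration of the general idea only, under the assumption that story chunks are scored for relevance against a character's persona description and the top-scoring chunks are placed in the role-playing prompt. The function names, the lexical-overlap scorer, and the prompt layout are all invented for illustration, not taken from the paper.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve_persona_memory(chunks, persona, k=2):
    """Toy persona-based retrieval: rank story chunks by lexical overlap
    with the persona description and keep the top-k as 'memory'.
    (A real system would likely use dense embeddings instead.)"""
    persona_tokens = tokenize(persona)
    scored = [(len(tokenize(c) & persona_tokens), i, c)
              for i, c in enumerate(chunks)]
    scored.sort(key=lambda t: (-t[0], t[1]))  # high overlap first, stable order
    return [c for _, _, c in scored[:k]]

def build_prompt(character, persona, memory, decision_point):
    """Assemble a role-play prompt from persona and retrieved memories."""
    memory_block = "\n".join(f"- {m}" for m in memory)
    return (
        f"You are role-playing {character}. Persona: {persona}\n"
        f"Relevant memories:\n{memory_block}\n"
        f"Decision point: {decision_point}\n"
        f"What does {character} choose?"
    )
```

A usage example: given chunks about a character and an unrelated chunk, the persona-relevant chunks are retrieved first and injected into the prompt, giving the model decision-relevant context without the full book.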
URL
https://arxiv.org/abs/2404.12138