Abstract
Envy is a common human behavior that shapes competitiveness and can alter outcomes in team settings. As large language models (LLMs) increasingly act on behalf of humans in collaborative and competitive workflows, there is a pressing need to evaluate whether, and under what conditions, they exhibit envy-like preferences. In this paper, we test whether LLMs show envy-like behavior toward each other. We consider two scenarios: (1) a point-allocation game that tests whether a model tries to outperform its peer, and (2) a workplace setting in which we observe model behavior when recognition is unfair. Our findings reveal consistent evidence of envy-like patterns in certain LLMs, with large variation across models and contexts. For instance, GPT-5-mini and Claude-3.7-Sonnet show a clear tendency to pull down the peer model to equalize outcomes, whereas Mistral-Small-3.2-24B instead focuses on maximizing its own gains. These results highlight the need to consider competitive dispositions as a safety and design factor in LLM-based multi-agent systems.
URL
https://arxiv.org/abs/2512.13481