Abstract
Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions. However, agents within an ABM follow predetermined, not fully rational, behavioural rules, which can be cumbersome to design and difficult to justify. Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs by introducing fully rational agents that learn their policy by interacting with the environment and maximising a reward function. Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature. We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for a thorough study of the impact of rationality on the economy. We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality. We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits. Finally, we find that a higher degree of rationality in the economy always improves the macroeconomic environment as measured by total output; depending on the specific rational policy, however, this can come at the cost of higher instability. Our R-MABM framework is general, allows for stable multi-agent learning, and represents a principled and robust direction to extend existing economic simulators.
URL
https://arxiv.org/abs/2405.02161