The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games

2025-06-18 16:53:38
Lyle Goodyear, Rachel Guo, Ramesh Johari

Abstract

Large Language Models (LLMs) have shown promise as decision-makers in dynamic settings, but their stateless nature necessitates creating a natural language representation of history. We present a unifying framework for systematically constructing natural language "state" representations for prompting LLM agents in repeated multi-agent games. Previous work on games with LLM agents has taken an ad hoc approach to encoding game history, which not only obscures the impact of state representation on agents' behavior, but also limits comparability between studies. Our framework addresses these gaps by characterizing methods of state representation along three axes: action informativeness (i.e., the extent to which the state representation captures actions played); reward informativeness (i.e., the extent to which the state representation describes rewards obtained); and prompting style (or natural language compression, i.e., the extent to which the full text history is summarized). We apply this framework to a dynamic selfish routing game, chosen because it admits a simple equilibrium both in theory and in human subject experiments \cite{rapoport_choice_2009}. Despite the game's relative simplicity, we find that LLM agent behavior depends in key ways on the natural language state representation. In particular, we observe that representations which provide agents with (1) summarized, rather than complete, natural language representations of past history; (2) information about regrets, rather than raw payoffs; and (3) limited information about others' actions lead to behavior that more closely matches game-theoretic equilibrium predictions, and to more stable game play by the agents. By contrast, other representations can exhibit either large deviations from equilibrium, higher variation in dynamic game play over time, or both.
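The paper's actual prompt templates are not reproduced on this page, so the following is only a minimal, illustrative Python sketch of how the three axes might be varied when rendering game history as a natural language "state" for a two-route routing game. The `RoundRecord` structure, the `build_state_prompt` function, and the specific summary wording are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of a state-representation
# builder for an LLM agent in a two-route congestion game, varying the
# paper's three axes: action informativeness, reward informativeness,
# and prompting style (natural language compression).
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class RoundRecord:
    my_route: str            # route this agent chose, e.g. "A" or "B"
    counts: Dict[str, int]   # how many agents chose each route this round
    my_payoff: float         # payoff this agent obtained
    best_payoff: float       # best payoff available in hindsight this round

def build_state_prompt(history: List[RoundRecord],
                       action_info: str = "own_only",   # or "all_counts"
                       reward_info: str = "regret",     # or "raw_payoff"
                       style: str = "summary") -> str:  # or "full"
    """Render game history as natural language, varying the three axes."""
    if not history:
        return "No rounds have been played yet."

    if style == "full":
        # Low compression: a round-by-round log of the complete history.
        lines = []
        for t, r in enumerate(history, start=1):
            line = f"Round {t}: you chose route {r.my_route}"
            if action_info == "all_counts":
                joined = ", ".join(f"{k}: {v}" for k, v in sorted(r.counts.items()))
                line += f" (route counts: {joined})"
            if reward_info == "regret":
                line += f"; your regret was {r.best_payoff - r.my_payoff:.2f}."
            else:
                line += f"; your payoff was {r.my_payoff:.2f}."
            lines.append(line)
        return "\n".join(lines)

    # High compression ("summary"): a few aggregate statistics only.
    n = len(history)
    share_a = sum(r.my_route == "A" for r in history) / n
    if reward_info == "regret":
        avg = sum(r.best_payoff - r.my_payoff for r in history) / n
        metric = f"an average regret of {avg:.2f}"
    else:
        avg = sum(r.my_payoff for r in history) / n
        metric = f"an average payoff of {avg:.2f}"
    text = (f"Over {n} rounds you chose route A {share_a:.0%} of the time, "
            f"with {metric} per round.")
    if action_info == "all_counts":
        avg_a = sum(r.counts.get("A", 0) for r in history) / n
        text += f" On average {avg_a:.1f} players per round chose route A."
    return text

# Example: the summarized, regret-based, own-actions-only configuration that
# the abstract associates with more equilibrium-consistent behavior.
history = [
    RoundRecord("A", {"A": 6, "B": 4}, my_payoff=3.0, best_payoff=4.0),
    RoundRecord("B", {"A": 5, "B": 5}, my_payoff=3.5, best_payoff=3.5),
]
print(build_state_prompt(history, action_info="own_only",
                         reward_info="regret", style="summary"))
```

Under this sketch, the representation the abstract reports as closest to equilibrium play corresponds to `style="summary"`, `reward_info="regret"`, and `action_info="own_only"`, while the other settings illustrate the fuller, less compressed alternatives the paper contrasts against.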

URL

https://arxiv.org/abs/2506.15624

PDF

https://arxiv.org/pdf/2506.15624.pdf

