Abstract
Game-theoretic models are effective tools for modeling multi-agent interactions, especially when robots need to coordinate with humans. However, applying these models requires inferring their specifications from observed behaviors, a challenging task known as the inverse game problem. Existing inverse game approaches often struggle to account for behavioral uncertainty and measurement noise, and to leverage both offline and online data. To address these limitations, we propose an inverse game method that integrates a generative trajectory model into a differentiable mixed-strategy game framework. By representing the mixed strategy with a conditional variational autoencoder (CVAE), our method can infer high-dimensional, multi-modal behavior distributions from noisy measurements while adapting in real time to new observations. We extensively evaluate our method in a simulated navigation benchmark, where the observations are generated by an unknown game model. Despite the model mismatch, our method can infer Nash-optimal actions comparable to those of the ground-truth model and the oracle inverse game baseline, even in the presence of uncertain agent objectives and noisy measurements.
Abstract (translated)
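Game-theoretic models are effective tools for modeling multi-agent interactions, especially when robots need to coordinate with humans. However, applying these models requires inferring their specifications from observed behavior, a difficult task known as the inverse game problem. Existing inverse game methods often struggle to handle behavioral uncertainty and measurement noise, and to make use of both offline and online data. To overcome these limitations, we propose an inverse game method that combines a generative trajectory model with a differentiable mixed-strategy game framework. By representing the mixed strategy with a conditional variational autoencoder (CVAE), our method can infer high-dimensional, multi-modal behavior distributions from noisy measurements and adapt to new observations in real time. We thoroughly evaluate the method in a simulated navigation benchmark in which the observations are generated by an unknown game model. Even with this model mismatch, and with uncertain objectives and noisy measurements, our method can still infer Nash-optimal actions comparable to those of the ground-truth model and the oracle inverse game baseline. The approach shows strong potential for complex, highly uncertain multi-agent systems, especially in applications involving human-robot interaction.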
URL
https://arxiv.org/abs/2502.03356