Human-in-the-loop Learning for Dynamic Congestion Games

Abstract
Abstract (translated)
URL
PDF

Abstract

Today mobile users learn and share their traffic observations via crowdsourcing platforms (e.g., Waze). Yet such platforms simply cater to selfish users' myopic interests to recommend the shortest path, and do not encourage enough users to travel and learn other paths for future others. Prior studies focus on one-shot congestion games without considering users' information learning, while our work studies how users learn and alter traffic conditions on stochastic paths in a human-in-the-loop manner. Our analysis shows that the myopic routing policy leads to severe under-exploration of stochastic paths. This results in a price of anarchy (PoA) greater than $2$, as compared to the socially optimal policy in minimizing the long-term social cost. Besides, the myopic policy fails to ensure the correct learning convergence about users' traffic hazard beliefs. To address this, we focus on informational (non-monetary) mechanisms as they are easier to implement than pricing. We first show that existing information-hiding mechanisms and deterministic path-recommendation mechanisms in Bayesian persuasion literature do not work with even (\text{PoA}=\infty). Accordingly, we propose a new combined hiding and probabilistic recommendation (CHAR) mechanism to hide all information from a selected user group and provide state-dependent probabilistic recommendations to the other user group. Our CHAR successfully ensures PoA less than (\frac{5}{4}), which cannot be further reduced by any other informational (non-monetary) mechanism. Besides the parallel network, we further extend our analysis and CHAR to more general linear path graphs with multiple intermediate nodes, and we prove that the PoA results remain unchanged. Additionally, we carry out experiments with real-world datasets to further extend our routing graphs and verify the close-to-optimal performance of our CHAR.

Abstract (translated)

今天的移动用户通过众包平台（例如，Waze）学习和分享他们的交通观察。然而，这样的平台仅迎合了自私用户的狭隘兴趣，以推荐最短的路径，并没有鼓励足够的用户旅行和探索其他路径，为未来的其他人提供学习机会。以前的研究集中在一次性拥堵游戏中，没有考虑用户的信息学习，而我们的研究则研究了用户如何在随机路径上学习和改变交通状况。我们的分析表明，聚类路由策略导致随机路径的严重缺乏探索。这使得社会最优政策（以最小化长期社会成本为目标）在PoA上大于2。此外，聚类策略无法确保用户关于交通危险信念的正确学习收敛。为了解决这个问题，我们关注信息（非货币）机制，因为它们比定价更容易实现。我们首先证明，在贝叶斯说服性文献中的现有信息隐藏机制和确定性路径推荐机制在PoA=无限的情况下无法起作用。因此，我们提出了一个新的综合隐藏和概率推荐（CHAR）机制，用于隐藏所选用户组的所有信息，并为另一组用户提供基于状态的概率推荐。我们的CHAR成功地确保了PoA小于（五分之四），这可以通过其他信息（非货币）机制进一步降低。除了并行网络之外，我们进一步将分析扩展到具有多个中转节点的更一般线性路径图上，并证明了PoA的结果保持不变。此外，我们还在真实世界数据集上进行实验，进一步扩展我们的路由图，并验证我们的CHAR的近最优性能。

URL

https://arxiv.org/abs/2404.15599

PDF

https://arxiv.org/pdf/2404.15599.pdf

Human-in-the-loop Learning for Dynamic Congestion Games

Abstract

Abstract (translated)

URL

PDF Copy

PDF