The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability

2025-06-11 17:06:57
Jiachen Hu, Rui Ai, Han Zhong, Xiaoyu Chen, Liwei Wang, Zhaoran Wang, Zhuoran Yang

Abstract

Information asymmetry is a pervasive feature of multi-agent systems, especially evident in economics and social sciences. In these settings, agents tailor their actions based on private information to maximize their rewards. These strategic behaviors often introduce complexities due to confounding variables. Simultaneously, knowledge transportability poses another significant challenge, arising from the difficulties of conducting experiments in target environments. It requires transferring knowledge from environments where empirical data is more readily available. Against these backdrops, this paper explores a fundamental question in online learning: Can we employ non-i.i.d. actions to learn about confounders even when requiring knowledge transfer? We present a sample-efficient algorithm designed to accurately identify system dynamics under information asymmetry and to navigate the challenges of knowledge transfer effectively in reinforcement learning, framed within an online strategic interaction model. Our method provably achieves learning of an $\epsilon$-optimal policy with a tight sample complexity of $O(1/\epsilon^2)$.

URL

https://arxiv.org/abs/2506.09940

PDF

https://arxiv.org/pdf/2506.09940.pdf
