Off-the-Grid MARL: a Framework for Dataset Generation with Baselines for Cooperative Offline Multi-Agent Reinforcement Learning

2023-02-01 15:41:27
Claude Formanek, Asad Jeewa, Jonathan Shock, Arnu Pretorius

Abstract

Being able to harness the power of large, static datasets for developing autonomous multi-agent systems could unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed system processes can often be recorded during operation, and large quantities of demonstrative data can be stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective online controllers from static datasets. However, offline MARL is still in its infancy, and, therefore, lacks standardised benchmarks, baselines and evaluation protocols typically found in more mature subfields of RL. This deficiency makes it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing off-the-grid MARL (OG-MARL): a framework for generating offline MARL datasets and algorithms. We release an initial set of datasets and baselines for cooperative offline MARL, created using the framework, along with a standardised evaluation protocol. Our datasets provide settings that are characteristic of real-world systems, including complex dynamics, non-stationarity, partial observability, suboptimality and sparse rewards, and are generated from popular online MARL benchmarks. We hope that OG-MARL will serve the community and help steer progress in offline MARL, while also providing an easy entry point for researchers new to the field.

URL

https://arxiv.org/abs/2302.00521

PDF

https://arxiv.org/pdf/2302.00521.pdf

