Abstract
The sim-to-real gap poses a significant challenge in RL-based multi-agent exploration because of scene quantization and action discretization. Existing platforms suffer from inefficient sampling and a lack of diversity in Multi-Agent Reinforcement Learning (MARL) algorithms across scenarios, which limits their widespread application. To fill these gaps, we propose MAexp, a generic platform for multi-agent exploration that integrates a broad range of state-of-the-art MARL algorithms and representative scenarios. We represent exploration scenarios as point clouds, which yields high-fidelity environment mapping and a sampling speed approximately 40 times faster than existing platforms. Equipped with an attention-based Multi-Agent Target Generator and a Single-Agent Motion Planner, MAexp works with arbitrary numbers of agents and accommodates various types of robots. Through extensive experiments, we establish the first benchmark of several high-performance MARL algorithms across typical scenarios for robots with continuous actions, highlighting the distinct strengths of each algorithm in different scenarios.
URL
https://arxiv.org/abs/2404.12824