Abstract
Interesting and efficient collective behavior observed in multi-robot or swarm systems emerges from the individual behavior of the robots. The functional space of individual robot behaviors is in turn shaped or constrained by the robot's morphology or physical design. Thus the full potential of multi-robot systems can be realized by concurrently optimizing the morphology and behavior of individual robots, informed by the environment's feedback about their collective performance, as opposed to treating morphology and behavior choices separately or in sequence (the classical approach). This paper presents an efficient concurrent design, or co-design, method to explore this potential and understand how morphology choices impact collective behavior, particularly in a multi-robot task allocation (MRTA) problem focused on a flood response scenario, where the individual behavior is designed via graph reinforcement learning. Computational efficiency in this case is attributed to a new near-exact decomposition of the co-design problem into a series of simpler optimization and learning problems. This is achieved through i) the identification and use of the Pareto front of Talent metrics that represent morphology-dependent robot capabilities, and ii) learning the selection of the best Talent trade-offs and the individual robot policy that jointly maximize MRTA performance. Applied to a multi-unmanned aerial vehicle flood response use case, the co-design outcomes are shown to readily outperform sequential design baselines. Significant differences in morphology and learned behavior are also observed when comparing co-designed single-robot vs. co-designed multi-robot systems for similar operations.
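As an illustration of step i), the sketch below picks out the Pareto front from a small set of candidate morphologies scored on two Talent metrics. It is a minimal example under stated assumptions, not the paper's implementation: the metric names (flight range, payload), the numeric values, and the helper names `dominates` and `pareto_front` are all hypothetical, and both metrics are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` in every metric
    and strictly better in at least one (all metrics maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical Talent values per candidate morphology:
# (flight range in km, payload capacity in kg). Illustrative numbers only.
candidates = [(10.0, 2.0), (8.0, 3.0), (7.0, 2.5), (12.0, 1.0), (9.0, 1.5)]

front = pareto_front(candidates)
print(front)  # indices of the non-dominated candidates
```

In this toy data, candidates 2 and 4 are each dominated by another candidate, so only the remaining three trade-offs would be passed on to the learning stage that selects among them.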
URL
https://arxiv.org/abs/2411.18519