Abstract
Large-scale mobile multi-robot systems can offer advantages over monolithic robots because of their higher potential for robustness and scalability. Developing controllers for multi-robot systems is challenging because the multitude of interactions is hard to anticipate and difficult to model. Automatic design using machine learning or evolutionary robotics seems to be an option to avoid that challenge, but it brings the challenge of designing reward or fitness functions. Generic reward and fitness functions seem unlikely to exist, and task-specific rewards often have undesired side effects. Approaches based on so-called innate motivation try to avoid the specific formulation of rewards and rely instead on other drivers, such as curiosity. Our approach to innate motivation is to minimize surprise, which we implement by maximizing the accuracy of the swarm robot's sensor predictions using neuroevolution. A unique advantage of the swarm robot case is that swarm members populate the robot's environment and can trigger more active behaviors in a self-referential loop. We summarize our previous simulation-based results concerning behavioral diversity, robustness, scalability, and engineered self-organization, and put them into context. In several new studies, we analyze the influence of the optimizer's hyperparameters, the scalability of evolved behaviors, and the impact of realistic robot simulations. Finally, we present results using real robots that show how the reality gap can be bridged.
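The abstract does not state the fitness function itself. As a minimal sketch of what "maximizing the accuracy of sensor predictions" could look like, the Python below scores a swarm by the fraction of sensor values that each robot predicted correctly. The function name, the binary-sensor encoding, and the nested-list data layout are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def prediction_accuracy_fitness(
    predictions: Sequence[Sequence[Sequence[int]]],
    observations: Sequence[Sequence[Sequence[int]]],
) -> float:
    """Mean prediction accuracy over robots, time steps, and sensors.

    Assumed layout (hypothetical): predictions[n][t][r] is robot n's
    prediction for binary sensor r at step t, and observations[n][t][r]
    is the value that sensor actually reported.
    Returns a value in [0, 1]; 1.0 means the swarm was never surprised.
    """
    total = 0.0
    count = 0
    for robot_pred, robot_obs in zip(predictions, observations):
        for step_pred, step_obs in zip(robot_pred, robot_obs):
            for p, s in zip(step_pred, step_obs):
                # A correct binary prediction contributes 1, a wrong one 0.
                total += 1.0 - abs(p - s)
                count += 1
    if count == 0:
        raise ValueError("no sensor readings to score")
    return total / count


# Two robots, two time steps, two binary sensors each.
example_predictions = [[[0, 1], [1, 1]], [[0, 0], [1, 0]]]
example_observations = [[[0, 1], [1, 0]], [[0, 0], [1, 0]]]
# Seven of the eight predictions match the observations.
example_fitness = prediction_accuracy_fitness(
    example_predictions, example_observations
)
```

In a setup like this, an evolutionary optimizer would select controllers with higher scores. Note that the score rewards only predictability and says nothing about which behavior should emerge, which is the sense in which the reward is not task-specific.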
URL
https://arxiv.org/abs/2405.02579