Abstract
As human-robot collaboration increases in the workforce, it becomes essential for human-robot teams to coordinate efficiently and intuitively. Traditional approaches to human-robot scheduling rely either on exact methods, which are intractable for large-scale problems and struggle to account for stochastic, time-varying human task performance, or on application-specific heuristics, which require expert domain knowledge to develop. We propose a deep learning-based framework, called HybridNet, that combines a heterogeneous graph-based encoder with a recurrent schedule propagator for scheduling stochastic human-robot teams under upper- and lower-bound temporal constraints. HybridNet's encoder leverages Heterogeneous Graph Attention Networks to model the initial environment and team dynamics while accounting for the constraints. By formulating task scheduling as a sequential decision-making process, HybridNet's recurrent neural schedule propagator uses Long Short-Term Memory (LSTM) models to propagate the consequences of actions forward, enabling fast schedule generation without the need to interact with the environment between every task-agent pair selection. The resulting scheduling policy network is a computationally lightweight yet highly expressive model that is end-to-end trainable via Reinforcement Learning algorithms. We develop a virtual task scheduling environment for mixed human-robot teams in a multi-round setting, capable of modeling the stochastic learning behaviors of human workers. Experimental results showed that HybridNet outperformed other human-robot scheduling solutions across problem sizes for both deterministic and stochastic human performance, with faster runtime than pure-GNN-based schedulers.
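The encode-once, propagate-forward idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class TinyLSTMCell, the function generate_schedule, the fixed embedding vectors (standing in for the output of a graph encoder), the dot-product scoring rule, and the greedy selection are all assumptions made for illustration. The weights are random and untrained, and temporal constraints are not enforced. The only point shown is that each task-agent pair is chosen from a recurrent state that is updated internally, with no call to an environment inside the loop.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class TinyLSTMCell:
    """Single LSTM cell with random, untrained weights (hypothetical, illustrative only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        n = input_size + hidden_size
        self.hidden_size = hidden_size
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.w = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden_size)]
            for gate in "ifog"
        }

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        pre = {gate: [dot(row, z) for row in self.w[gate]] for gate in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def generate_schedule(task_emb, agent_emb, cell):
    """Greedily pick task-agent pairs, updating a recurrent state after each pick."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    remaining = set(task_emb)
    schedule = []
    while remaining:
        # Score every (unscheduled task, agent) pair from the current recurrent state.
        best = max(
            ((t, a) for t in sorted(remaining) for a in sorted(agent_emb)),
            key=lambda p: dot(h, task_emb[p[0]]) + dot(task_emb[p[0]], agent_emb[p[1]]),
        )
        schedule.append(best)
        remaining.remove(best[0])
        # Propagate the consequence of the choice internally; no environment step here.
        h, c = cell.step(task_emb[best[0]] + agent_emb[best[1]], h, c)
    return schedule


# Hypothetical embeddings standing in for the output of a graph encoder.
task_emb = {"t1": [0.2, 0.9], "t2": [0.8, 0.1], "t3": [0.5, 0.5]}
agent_emb = {"human": [1.0, 0.0], "robot": [0.0, 1.0]}
cell = TinyLSTMCell(input_size=4, hidden_size=2)
schedule = generate_schedule(task_emb, agent_emb, cell)
print(schedule)
```

In a trained policy the selection would typically be sampled from a learned distribution rather than taken greedily; the greedy choice here only keeps the sketch deterministic.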
Abstract (translated)
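Collaboration between humans and robots is increasing in the workforce, so human-robot teams must coordinate efficiently and intuitively. Traditional human-robot scheduling methods either use exact methods that cannot handle large-scale problems and struggle with stochastic, time-varying human task performance, or rely on application-specific heuristics that require expert domain knowledge to develop. We propose a deep learning-based framework, called HybridNet, that combines a heterogeneous graph-based encoder with a recurrent schedule propagator to schedule stochastic human-robot teams under upper- and lower-bound temporal constraints. HybridNet's encoder uses Heterogeneous Graph Attention Networks to model the initial environment and team dynamics while taking the constraints into account. By formulating task scheduling as a sequential decision-making process, HybridNet's recurrent neural schedule propagator uses LSTM models to propagate the consequences of actions forward and generate schedules quickly, removing the need to interact with the environment between every task-agent pair selection. The resulting scheduling policy network is a computationally lightweight yet highly expressive model that can be trained end-to-end with reinforcement learning algorithms. We develop a virtual task scheduling environment for mixed human-robot teams in a multi-round setting that can model the stochastic learning behavior of human workers. Experimental results showed that HybridNet outperformed other human-robot scheduling solutions across all problem sizes for both deterministic and stochastic human performance, and ran faster than pure-GNN-based schedulers.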
URL
https://arxiv.org/abs/2301.13279