Abstract
Automating the segregation process is a need in every sector that handles high volumes of material through repetitive, exhausting, and hazardous operations. Automated pick-and-place operations can be learned efficiently by introducing collaborative autonomous systems (e.g., manipulators) into the workplace alongside human operators. In this paper, we propose a deep reinforcement learning strategy to learn the place task of multi-category items from a workspace shared by two manipulators to multiple goal destinations, assuming the pick has already been completed. The learning strategy leverages, first, a stochastic actor-critic framework to train an agent's policy network and, second, a dynamic 3D Gym environment in which both static and dynamic obstacles (e.g., human operators and the partner robot) constitute the state space of a Markov decision process. Learning is conducted in the Gazebo simulator, and experiments show an increase in the cumulative reward for the agent farther away from the human operators. Future investigations will aim to enhance task performance for both agents simultaneously.
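The following is a minimal, self-contained sketch of the kind of setup the abstract describes: a stochastic actor-critic agent trained in an environment whose state space includes an obstacle standing in for a human operator or the partner robot. It is not the paper's implementation; the grid layout, reward values, and hyperparameters are all illustrative assumptions, and the paper's 3D Gym/Gazebo environment is reduced here to a tabular 4x4 grid for brevity.

```python
import math
import random

# Illustrative toy sketch (not the paper's method): tabular one-step
# stochastic actor-critic on a 4x4 grid "place" task. The agent moves an
# end-effector toward a goal cell; one penalized cell stands in for a
# human operator or the partner robot. All numbers are assumptions.

W = 4                                        # grid width/height
GOAL = (3, 3)                                # place destination
OBSTACLE = (1, 1)                            # cell occupied by human/robot mate
MOVES = ((0, 1), (0, -1), (1, 0), (-1, 0))   # four planar actions

def step(s, a):
    """Apply action a in state s; return (next_state, reward, done)."""
    dx, dy = MOVES[a]
    s2 = (min(max(s[0] + dx, 0), W - 1), min(max(s[1] + dy, 0), W - 1))
    if s2 == GOAL:
        return s2, 1.0, True                 # item placed at the destination
    if s2 == OBSTACLE:
        return s2, -1.0, False               # penalty for approaching the obstacle
    return s2, -0.01, False                  # small per-step cost

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    t = sum(e)
    return [v / t for v in e]

def train(episodes=3000, alpha=0.2, beta=0.2, gamma=0.95, seed=0):
    rng = random.Random(seed)
    idx = lambda s: s[0] * W + s[1]
    theta = [[0.0] * 4 for _ in range(W * W)]   # actor: per-state action logits
    value = [0.0] * (W * W)                     # critic: state-value estimates
    returns = []
    for _ in range(episodes):
        s, ep_return = (0, 0), 0.0
        for _ in range(40):                     # episode step limit
            probs = softmax(theta[idx(s)])
            a = rng.choices(range(4), weights=probs)[0]
            s2, r, done = step(s, a)
            ep_return += r
            # One-step TD error drives both the critic and the actor update.
            td = r + (0.0 if done else gamma * value[idx(s2)]) - value[idx(s)]
            value[idx(s)] += beta * td
            for i in range(4):                  # grad of log softmax policy
                theta[idx(s)][i] += alpha * td * ((1.0 if i == a else 0.0) - probs[i])
            s = s2
            if done:
                break
        returns.append(ep_return)
    return returns

returns = train()
early = sum(returns[:200]) / 200
late = sum(returns[-200:]) / 200
print(early, late)   # average return early vs. late in training
```

Under this toy reward shaping, the late-training average return exceeds the early one as the policy learns to route the end-effector around the obstacle cell, mirroring the abstract's observation that cumulative reward grows for the agent that stays farther from the human operators.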
URL
https://arxiv.org/abs/2404.17673