Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines

2021-11-18 02:02:08

Xuejing Zheng, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo

arXiv_AI

Abstract

Continually learning new tasks using high-level ideas or knowledge is a key human capability. In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks. To allow more flexible task specification, we first introduce Sequential Linear Temporal Logic (SLTL), which supplements the existing Linear Temporal Logic (LTL) formal language. We then use Reward Machines (RM) to exploit structured reward functions for tasks encoded with high-level events, and propose automatic extension of RMs and efficient knowledge transfer across tasks for continual learning over the agent's lifetime. Experimental results show that LSRM outperforms methods that learn the target tasks from scratch, by taking advantage of task decomposition using SLTL and knowledge transfer over RMs during the lifelong learning process.
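
For readers unfamiliar with reward machines, the following is a minimal sketch of the general idea: a finite-state machine whose transitions fire on high-level events and emit rewards. It is not the LSRM implementation. The class name `RewardMachine`, the state labels, the events `coffee` and `office`, and the example task are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Illustrative reward machine: states, event-labelled transitions, rewards."""
    initial: str
    terminal: set
    # transitions[(state, event)] = (next_state, reward)
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Assumption: unlisted (state, event) pairs self-loop with zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        # Feed a sequence of high-level events and accumulate the reward.
        state, total = self.initial, 0.0
        for event in events:
            if state in self.terminal:
                break
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical sequential task: "get coffee, then reach the office".
rm = RewardMachine(
    initial="u0",
    terminal={"u_done"},
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
)

final_state, total_reward = rm.run(["office", "coffee", "office"])
```

In this sketch, reaching the office before getting coffee earns nothing, because the machine is still in its initial state. The reward is emitted only once the events occur in the specified order, which is the kind of task structure a sequential specification is meant to expose.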

URL

https://arxiv.org/abs/2111.09475

PDF

https://arxiv.org/pdf/2111.09475.pdf