Abstract
In this article, we sketch an algorithm that extends Q-learning to continuous action spaces. Our method is based on discretization of the action space. Unlike commonly used discretization methods, our method does not increase the dimensionality of the discretized problem exponentially. We show that, when discretization is employed, the complexity of our proposed method grows only linearly. The variant of the Q-learning algorithm presented in this work, labeled Finite Step Q-Learning (FSQ), can be deployed with both shallow and deep neural network architectures.
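The abstract gives no implementation details, so the following is only a plausible reading of the linearity claim, with all names and the finite-step scheme itself being assumptions: a full grid discretization of a d-dimensional action space with k points per dimension yields k^d discrete actions, whereas a scheme that only considers nudging one action dimension at a time by a finite step ±δ yields 2d + 1 candidates, which is linear in d.

```python
# Hypothetical sketch (not the paper's actual FSQ implementation): enumerate
# finite-step neighbours of a continuous action vector. Each candidate changes
# a single dimension by +/- delta, clipped to the action bounds, giving
# 2*d + 1 candidates instead of the k**d actions of a full grid.

def candidate_actions(action, delta, low, high):
    """Enumerate finite-step neighbours of a continuous action vector."""
    d = len(action)
    candidates = [list(action)]                 # option: keep the current action
    for i in range(d):
        for step in (+delta, -delta):
            nxt = list(action)
            nxt[i] = min(high[i], max(low[i], nxt[i] + step))
            candidates.append(nxt)
    return candidates                           # 2*d + 1 action vectors

d = 3
cands = candidate_actions([0.0] * d, 0.1, [-1.0] * d, [1.0] * d)
print(len(cands))  # 7 candidates, versus 10**3 = 1000 for a 10-point grid
```

An agent using such a scheme would evaluate Q(s, a') for each of the 2d + 1 candidates and pick the argmax, so the per-step cost scales linearly with the action dimension.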
URL
https://arxiv.org/abs/1807.06957