Abstract
Model Predictive Control (MPC) is attracting tremendous attention as a powerful control technique for autonomous driving. The success of an MPC controller depends strongly on an accurate internal dynamics model. However, static parameters, usually learned by system identification, often fail to adapt to both internal and external perturbations in real-world scenarios. In this paper, we (1) reformulate the problem as a Partially Observed Markov Decision Process (POMDP) that absorbs the uncertainties into observations and maintains the Markov property through hidden states; (2) learn a recurrent policy that continually adapts the parameters of the dynamics model via Recurrent Reinforcement Learning (RRL) for optimal and adaptive control; and (3) evaluate the proposed algorithm (referred to as $\textit{MPC-RRL}$) in the CARLA simulator, obtaining robust behaviour under a wide range of perturbations.
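To make the core idea concrete, here is a minimal, hypothetical sketch of an MPC controller whose internal dynamics parameter (a scalar drag coefficient `c`) is adapted online from observations. This is not the paper's MPC-RRL algorithm: the adapter below is a simple recurrent, error-driven gradient update rather than a learned RRL policy, and the 1-D dynamics, cost, and grid-search planner are invented for illustration only.

```python
# Hypothetical illustration: MPC with an online-adapted internal model.
# The true system has drag c_true, unknown to the controller; the
# controller's model carries an estimate c that is updated recurrently
# from one-step prediction errors (a stand-in for the paper's RRL policy).

def step_true(x, u, c_true=0.3, dt=0.1):
    """Ground-truth dynamics with drag c_true (unknown to the controller)."""
    return x + dt * (u - c_true * x)

def step_model(x, u, c, dt=0.1):
    """Controller's internal model with current drag estimate c."""
    return x + dt * (u - c * x)

def mpc_action(x, c, target=1.0, horizon=5):
    """Pick the first action of the best constant-action rollout (grid search)."""
    candidates = [i * 0.1 for i in range(-20, 21)]  # u in [-2.0, 2.0]
    best_u, best_cost = 0.0, float("inf")
    for u in candidates:
        xs, cost = x, 0.0
        for _ in range(horizon):
            xs = step_model(xs, u, c)
            cost += (xs - target) ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

def run(c_init=0.0, lr=0.5, steps=50):
    """Closed loop: act, observe, and recurrently adapt the drag estimate."""
    x, c = 0.0, c_init
    for _ in range(steps):
        u = mpc_action(x, c)
        x_next = step_true(x, u)
        # Recurrent adapter: gradient step on the squared one-step
        # prediction error; err > 0 means the model under-estimates drag.
        err = step_model(x, u, c) - x_next
        c = c + lr * err * x
        x = x_next
    return x, c

final_x, final_c = run()
```

With a fixed (static) `c_init`, the controller persistently undershoots the target; with the recurrent update enabled, `c` drifts toward the true drag and tracking recovers, which is the qualitative behaviour the abstract attributes to adapting the dynamics model online.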
Abstract (translated)
Model Predictive Control (MPC) is attracting tremendous attention in autonomous driving as a powerful control technique. A successful MPC controller depends largely on an accurate internal dynamics model. However, static parameters, usually learned through system identification, often fail to adapt to internal and external disturbances in real-world scenarios. In this paper, we first (1) reformulate the problem as a Partially Observed Markov Decision Process (POMDP), which absorbs the uncertainties and stores them in hidden states; (2) learn a recurrent feedback controller that continually adapts the parameters of the dynamics model via Recurrent Reinforcement Learning (RRL) for optimal and adaptive control; and (3) finally evaluate the proposed algorithm (referred to as MPC-RRL) in the CARLA simulator, showing robust behaviour under a wide range of disturbances.
URL
https://arxiv.org/abs/2301.13313