Abstract
So-called implicit behavioral cloning with energy-based models has shown promising results in robotic manipulation tasks. We tested whether the method's advantages carry over to controlling the steering of a real self-driving car with an end-to-end driving model. We performed an extensive comparison of the implicit behavioral cloning approach with explicit baseline approaches, all sharing the same neural network backbone architecture. The explicit baseline models were trained with a regression (MAE) loss, with a classification loss (softmax and cross-entropy on a discretization), or as mixture density networks (MDN). While models using the energy-based formulation performed comparably to the baseline approaches in terms of safety driver interventions, they had a higher whiteness measure, indicating higher jerk. To alleviate this, we show two methods that can be used to improve the smoothness of steering. We confirmed that energy-based models handle multimodalities slightly better than simple regression, but this did not translate to significantly better driving ability. We argue that the steering-only road-following task has too few multimodalities to benefit from energy-based models. This shows that applying implicit behavioral cloning to real-world tasks can be challenging, and that further investigation is needed to bring out the theoretical advantages of energy-based models.
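The multimodality claim can be illustrated with a toy example. The sketch below is not the paper's model: the demonstration data, the kernel-based energy function, and the names `mae_optimal` and `energy_argmin` are all assumptions made for illustration. It contrasts a single point estimate fitted under an MAE loss with an implicit-style prediction that picks the lowest-energy action from a set of candidates.

```python
import math
import statistics


def mae_optimal(targets):
    """Return a point estimate that minimizes mean absolute error.

    The median minimizes MAE. When the data are split evenly between two
    modes, every value between the modes has the same MAE;
    statistics.median returns the midpoint of the two middle values.
    """
    return statistics.median(targets)


def energy_argmin(targets, candidates, bandwidth=0.1):
    """Return the candidate action with the lowest energy.

    Assumption: energy is the negative log of a Gaussian kernel density
    over the demonstrated actions. A trained energy-based model would
    learn this function from data instead.
    """
    def energy(action):
        density = sum(
            math.exp(-0.5 * ((action - t) / bandwidth) ** 2) for t in targets
        )
        return -math.log(density + 1e-12)

    return min(candidates, key=energy)


# Hypothetical demonstrations for one observation:
# half of the drivers steer left (-0.5), half steer right (+0.5).
targets = [-0.5] * 50 + [0.5] * 50

# Candidate steering commands on a grid from -1.0 to 1.0.
candidates = [i / 100 for i in range(-100, 101)]

point = mae_optimal(targets)                # falls between the two modes
mode = energy_argmin(targets, candidates)   # lands on one of the two modes

print(f"MAE point estimate: {point:+.2f}")
print(f"Lowest-energy action: {mode:+.2f}")
```

In this toy setting the point estimate sits between the two demonstrated behaviors, an action nobody demonstrated, while the energy-based choice commits to one of them. If steering-only road following rarely produces such evenly split situations, as the abstract argues, the gap between the two approaches would rarely matter in practice.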
URL
https://arxiv.org/abs/2301.12264