Amortized Q-learning with Model-based Action Proposals for Autonomous Driving on Highways

2020-12-06 11:04:40

Branka Mirchevska, Maria Hügle, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

arXiv_RO

arXiv_RO Reinforcement_Learning Optimization Autonomous Action Agent

Abstract

Well-established optimization-based methods can guarantee an optimal trajectory for a short optimization horizon, typically no longer than a few seconds. As a result, choosing the optimal trajectory for this short horizon may still result in a sub-optimal long-term solution. At the same time, the resulting short-term trajectories allow for effective, comfortable, and provably safe maneuvers in a dynamic traffic environment. In this work, we address the question of how to ensure an optimal long-term driving strategy while keeping the benefits of classical trajectory planning. We introduce a Reinforcement Learning based approach that, coupled with a trajectory planner, learns an optimal long-term decision-making strategy for driving on highways. By generating locally optimal maneuvers online and using them as actions, we strike a balance between the infinite low-level continuous action space and the limited flexibility of a fixed number of predefined standard lane-change actions. We evaluated our method on realistic scenarios in the open-source traffic simulator SUMO and achieved better performance than the four benchmark approaches we compared against: a random action-selecting agent, a greedy agent, an agent with high-level discrete actions, and an IDM-based SUMO-controlled agent.
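
The idea of treating planner-generated maneuvers as the action set can be illustrated with a minimal sketch. It is not the authors' implementation: the `Maneuver` encoding, the function names, and the epsilon-greedy exploration scheme are all assumptions made for illustration. The sketch only shows how a Q-function could score a variable-size set of proposed maneuvers, and how a one-step Q-learning target could take its maximum over the proposals available in the next state instead of over a continuous action space.

```python
import random
from typing import Callable, Sequence, Tuple

# Hypothetical encodings, chosen only for this sketch.
State = Tuple[float, ...]
Maneuver = Tuple[float, float]  # (lateral offset in m, target speed in m/s)
QFunction = Callable[[State, Maneuver], float]


def greedy_action(q_fn: QFunction, state: State,
                  proposals: Sequence[Maneuver]) -> Maneuver:
    """Return the proposed maneuver with the highest estimated Q-value."""
    return max(proposals, key=lambda a: q_fn(state, a))


def epsilon_greedy(q_fn: QFunction, state: State,
                   proposals: Sequence[Maneuver], epsilon: float,
                   rng: random.Random) -> Maneuver:
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(list(proposals))
    return greedy_action(q_fn, state, proposals)


def td_target(q_fn: QFunction, reward: float, next_state: State,
              next_proposals: Sequence[Maneuver], gamma: float,
              done: bool) -> float:
    """One-step Q-learning target, maximizing over proposed maneuvers only."""
    if done or not next_proposals:
        return reward
    best_next = max(q_fn(next_state, a) for a in next_proposals)
    return reward + gamma * best_next
```

In such a setup, a trajectory planner (not shown) would supply `proposals` at every decision step, so the learned Q-function only ever has to rank a handful of feasible maneuvers.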

URL

https://arxiv.org/abs/2012.03234

PDF

https://arxiv.org/pdf/2012.03234.pdf