Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark

2021-06-23 04:27:55

Stefan O'Toole, Nir Lipovetzky, Miquel Ramirez, Adrian Pearce

arXiv_AI

Abstract

We propose new width-based planning and learning algorithms and apply them to the Atari-2600 benchmark. The algorithms are inspired by a careful analysis of the design decisions made in previous width-based planners. We benchmark our new algorithms on the Atari-2600 games and show that our best-performing algorithm, RIW$_C$+CPV, outperforms the previously introduced width-based planning and learning algorithms $\pi$-IW(1), $\pi$-IW(1)+ and $\pi$-HIW(n, 1). Furthermore, we present a taxonomy of the Atari-2600 games according to some of their defining characteristics. This analysis of the games provides further insight into the behaviour and performance of the width-based algorithms introduced. In particular, for games with large branching factors and for games with sparse meaningful rewards, RIW$_C$+CPV outperforms $\pi$-IW, $\pi$-IW(1)+ and $\pi$-HIW(n, 1).
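
For readers unfamiliar with width-based planning, the following is a minimal sketch of the basic IW(1) novelty test that this family of planners builds on. It is not the RIW$_C$+CPV algorithm from the paper, and it omits the learnt policies and heuristics entirely. The function names (`iw1`, `grid_successors`, `grid_atoms`) and the toy grid domain are assumptions made purely for illustration. A state is kept only if it makes at least one atom true for the first time in the search; otherwise it is pruned.

```python
from collections import deque


def iw1(initial_state, successors, atoms, is_goal, max_nodes=10_000):
    """Breadth-first search with width-1 novelty pruning (illustrative sketch)."""
    seen_atoms = set(atoms(initial_state))   # atoms observed so far in the search
    queue = deque([(initial_state, [])])     # (state, plan that reaches it)
    expanded = 0
    while queue and expanded < max_nodes:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        expanded += 1
        for action, child in successors(state):
            new_atoms = set(atoms(child)) - seen_atoms
            if not new_atoms:
                continue                      # not novel at width 1: prune
            seen_atoms |= new_atoms
            queue.append((child, plan + [action]))
    return None                               # no goal found within width 1


# Hypothetical toy domain: an agent on a 4x4 grid that can move right or up.
def grid_successors(state, size=4):
    x, y = state
    if x + 1 < size:
        yield "right", (x + 1, y)
    if y + 1 < size:
        yield "up", (x, y + 1)


def grid_atoms(state):
    x, y = state
    return [("x", x), ("y", y)]               # one atom per state variable value


plan_easy = iw1((0, 0), grid_successors, grid_atoms, lambda s: s == (3, 0))
plan_hard = iw1((0, 0), grid_successors, grid_atoms, lambda s: s == (3, 3))
```

In this toy domain, the goal `(3, 0)` is reached because every step along the bottom row introduces a new value of `x`. The goal `(3, 3)` is not found: states such as `(1, 1)` contain only atoms that were already seen, so they are pruned. That trade-off between aggressive pruning and completeness is the kind of design decision that width-based planners have to manage.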

URL

https://arxiv.org/abs/2106.12151

PDF

https://arxiv.org/pdf/2106.12151.pdf