Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach

2022-11-15 03:46:41

Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas

arXiv_AI Reinforcement_Learning

Abstract

In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems. These methods are based on approximation in value space and the rollout approach, admit a straightforward implementation, and provide improved performance over various heuristic approaches. For the Wordle puzzle, they yield on-line solution strategies that are very close to optimal at relatively modest computational cost. Our methods are viable for more complex versions of Wordle and related search problems, for which an optimal strategy would be impossible to compute. They are also applicable to a wide range of adaptive sequential decision problems that involve an unknown or frequently changing environment whose parameters are estimated on-line.
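As a rough illustration of the rollout idea mentioned in the abstract, the sketch below picks a guess by simulating a simple base heuristic to the end of the game for every word still consistent with the feedback seen so far, and choosing the guess with the lowest average number of guesses. It is not the authors' implementation. The function names, the uniform belief over remaining words, and the "guess the first remaining candidate" base heuristic are all assumptions made for this example.

```python
from collections import Counter


def feedback(guess, secret):
    """Wordle feedback as a tuple: 2 = green, 1 = yellow, 0 = gray."""
    result = [0] * len(guess)
    remaining = Counter()  # secret letters not matched by a green
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = 2
        else:
            remaining[s] += 1
    for i, g in enumerate(guess):
        if result[i] == 0 and remaining[g] > 0:
            result[i] = 1
            remaining[g] -= 1
    return tuple(result)


def update_belief(belief, guess, observed):
    """Keep only the candidate words consistent with the observed feedback."""
    return [w for w in belief if feedback(guess, w) == observed]


def base_heuristic(belief):
    """Hypothetical base policy (an assumption): guess the first candidate."""
    return belief[0]


def guesses_to_solve(first_guess, secret, belief):
    """Guesses used when first_guess is followed by the base heuristic."""
    guess, count = first_guess, 1
    while guess != secret:
        belief = update_belief(belief, guess, feedback(guess, secret))
        guess = base_heuristic(belief)
        count += 1
    return count


def rollout_guess(belief, guesses=None):
    """One-step lookahead with rollout: minimize the average simulated cost,
    assuming every word in the belief is equally likely to be the secret."""
    guesses = belief if guesses is None else guesses

    def expected_cost(g):
        return sum(guesses_to_solve(g, s, belief) for s in belief) / len(belief)

    return min(guesses, key=expected_cost)


if __name__ == "__main__":
    candidates = ["batch", "catch", "hatch", "match"]
    # "cabby" is not a candidate answer, but it may still be a useful probe.
    print(rollout_guess(candidates, guesses=candidates + ["cabby"]))
```

Here the list of remaining candidate words plays the role of the belief state, and `secret` must be one of the words in `belief` for the simulation to terminate. A stronger base heuristic could be substituted for `base_heuristic` without changing the rest of the sketch.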

URL

https://arxiv.org/abs/2211.10298

PDF

https://arxiv.org/pdf/2211.10298.pdf