Goal-Directed Planning by Reinforcement Learning and Active Inference

Abstract

What is the difference between goal-directed and habitual behavior? We propose a novel computational framework for decision making based on Bayesian inference, in which all components are integrated into a single neural network model. The model learns to predict environmental state transitions through self-exploration and to generate motor actions by sampling stochastic internal states $z$. Habitual behavior, which is obtained from the prior distribution of $z$, is acquired by reinforcement learning. Goal-directed behavior is determined from the posterior distribution of $z$ by planning, using active inference, to minimize the free energy for goal observation. We demonstrate the effectiveness of the proposed framework through experiments on a sensorimotor navigation task with camera observations and continuous motor actions.
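
The prior/posterior distinction in the abstract can be illustrated with a toy sketch. This is not the paper's model: it assumes a diagonal Gaussian prior over $z$, a hypothetical linear predictor `W` mapping $z$ to a predicted observation, a posterior whose variance is fixed to the prior's, and plain gradient descent on the posterior mean. Under those assumptions, habitual behavior samples $z$ from the prior, while goal-directed planning shifts the posterior mean to reduce a free energy made of a goal prediction error plus a KL penalty for leaving the prior.

```python
import numpy as np

def kl_diag_gauss(mu_q, sig_q, mu_p, sig_p):
    """KL(q || p) between two diagonal Gaussians."""
    return float(np.sum(
        np.log(sig_p / sig_q)
        + (sig_q**2 + (mu_q - mu_p)**2) / (2.0 * sig_p**2)
        - 0.5
    ))

def free_energy(mu_q, sig_q, mu_p, sig_p, W, goal, obs_var=1.0):
    """Expected goal prediction error under q, plus KL(q || p).

    Assumes a hypothetical linear predictor o = W @ z, for which
    E_q ||W z - goal||^2 = ||W mu_q - goal||^2 + sum_j sig_q[j]^2 * ||W[:, j]||^2.
    """
    err = W @ mu_q - goal
    expected_sq = err @ err + np.sum(sig_q**2 * np.sum(W**2, axis=0))
    return float(expected_sq / (2.0 * obs_var)) + kl_diag_gauss(mu_q, sig_q, mu_p, sig_p)

def habitual_z(mu_p, sig_p, rng):
    """Habitual behavior (toy version): sample z directly from the prior."""
    return mu_p + sig_p * rng.standard_normal(mu_p.shape)

def plan_posterior_mean(mu_p, sig_p, W, goal, obs_var=1.0, steps=200, lr=0.05):
    """Goal-directed behavior (toy version): gradient descent on the
    posterior mean to minimize the free energy; variance stays at the prior's."""
    mu_q = mu_p.copy()
    for _ in range(steps):
        grad = W.T @ (W @ mu_q - goal) / obs_var + (mu_q - mu_p) / sig_p**2
        mu_q = mu_q - lr * grad
    return mu_q

# Illustrative values only (not taken from the paper).
mu_p = np.zeros(2)
sig_p = np.ones(2)
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
goal = np.array([1.0, 2.0, 3.0])

z_habit = habitual_z(mu_p, sig_p, np.random.default_rng(0))
mu_q = plan_posterior_mean(mu_p, sig_p, W, goal)
print("free energy at prior:    ", free_energy(mu_p, sig_p, mu_p, sig_p, W, goal))
print("free energy after plan:  ", free_energy(mu_q, sig_p, mu_p, sig_p, W, goal))
```

In this sketch the KL term keeps the planned posterior close to the habitual prior, so the plan only departs from habit as far as the goal error justifies. The actual framework uses learned neural network predictors and camera observations rather than a fixed linear map.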

URL

https://arxiv.org/abs/2106.09938

PDF

https://arxiv.org/pdf/2106.09938.pdf