Reinforced Language Modeling for End-to-End Task Oriented Dialog

Abstract

In task-oriented dialogs such as MultiWoZ (Budzianowski et al., 2018), an informative and/or successful system response needs to include key information such as the phone number of a hotel. We therefore hypothesize that helping the model focus on learning key quantities in the dialog will let it generate more informative and helpful responses. In this paper, we propose a new training algorithm, Reinforced Language Modeling (RLM), which uses a fine-grained reward function and reinforcement learning to help the model generate key quantities correctly at test time. Empirical results show that RLM achieves state-of-the-art performance on the inform rate, success rate, and combined score in MultiWoZ.

URL

https://arxiv.org/abs/2211.16773

PDF

https://arxiv.org/pdf/2211.16773.pdf
