Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System

2021-06-09 06:44:57

Zichuan Lin, Jing Huang, Bowen Zhou, Xiaodong He, Tengyu Ma

arXiv_CL

Abstract

Recent work (Takanobu et al., 2020) proposed system-wise evaluation of dialog systems and found that improvements to individual components (e.g., NLU, policy) in prior work do not necessarily benefit pipeline systems under system-wise evaluation. To improve system-wise performance, we propose new joint system-wise optimization techniques for the pipeline dialog system. First, we propose a new data augmentation approach that automates the labeling process for NLU training. Second, we propose a novel stochastic policy parameterization with a Poisson distribution that enables better exploration and offers a principled way to compute the policy gradient. Third, we propose a reward bonus to help the policy explore successful dialogs. Our approaches outperform the competitive pipeline systems from Takanobu et al. (2020) by large margins, 12% in success rate in automatic system-wise evaluation and 16% in success rate in human evaluation, on the standard multi-domain benchmark dataset MultiWOZ 2.1, and also outperform the recent state-of-the-art end-to-end trained model from DSTC9.
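
The abstract does not say which quantity the Poisson distribution models or how the policy network is structured, so the following is only a generic illustration of why a Poisson parameterization admits a closed-form score-function (REINFORCE) gradient. It assumes a single scalar parameter theta with rate lambda = exp(theta); all function names are hypothetical and none of this is taken from the paper's implementation.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one sample from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def poisson_log_prob(k, lam):
    """log P(K = k) for K ~ Poisson(lam): k*log(lam) - lam - log(k!)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def score_wrt_theta(k, theta):
    """d/d(theta) of log P(k) when lam = exp(theta); simplifies to k - lam."""
    return k - math.exp(theta)


def reinforce_gradient(theta, reward_fn, n_samples, rng):
    """Monte Carlo estimate of d/d(theta) E[reward_fn(K)] via the score function.

    reward_fn is a hypothetical stand-in for whatever return the dialog
    environment would assign to the sampled count k.
    """
    lam = math.exp(theta)
    total = 0.0
    for _ in range(n_samples):
        k = sample_poisson(lam, rng)
        total += reward_fn(k) * score_wrt_theta(k, theta)
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy reward equal to the sampled count; the exact gradient of
    # E[K] = exp(theta) with respect to theta is exp(theta).
    estimate = reinforce_gradient(0.0, lambda k: float(k), 20000, rng)
    print("estimated gradient:", estimate, "exact:", math.exp(0.0))
```

With the toy reward above, the estimate can be compared against the exact gradient exp(theta), which gives a quick sanity check that the score term k - lambda is correct.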

URL

https://arxiv.org/abs/2106.04835

PDF

https://arxiv.org/pdf/2106.04835.pdf