Abstract
Generative models for dialog systems have gained considerable interest because of the success of recent RNN- and Transformer-based models on complex natural language tasks such as question answering and summarization. Although the task of dialog response generation is generally framed as a sequence-to-sequence (Seq2Seq) problem, researchers have found it challenging to train dialog systems with standard Seq2Seq models. Therefore, to help the model learn important utterance- and conversation-level features, Sordoni et al. (2015) and Serban et al. (2016) proposed a hierarchical RNN architecture, which was later adopted by several other RNN-based dialog systems. With Transformer-based models now dominating seq2seq problems, a natural question is whether the notion of hierarchy also applies to Transformer-based dialog systems. In this paper, we show how a standard Transformer can be morphed into a hierarchical one using specially designed attention masks and positional embeddings. Our experiments show strong improvements in context-to-response generation performance for task-oriented dialog systems over current state-of-the-art approaches.
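To make the core idea concrete, below is a minimal sketch (not the paper's exact implementation) of how an attention mask alone can impose a two-level hierarchy on a standard Transformer: ordinary tokens attend only within their own utterance, while one designated position per utterance (here, its first token) also attends across utterances. The function name and the `utterance_ids` input are illustrative assumptions.

```python
import torch

def hierarchical_attention_mask(utterance_ids: torch.Tensor) -> torch.Tensor:
    """utterance_ids: (seq_len,) tensor mapping each token to its utterance index.
    Returns a (seq_len, seq_len) boolean mask where True = attention allowed."""
    # Word-level scope: tokens may attend to tokens in the same utterance.
    same_utt = utterance_ids.unsqueeze(0) == utterance_ids.unsqueeze(1)
    # Treat the first token of each utterance as its summary ("head") position.
    is_head = torch.ones_like(utterance_ids, dtype=torch.bool)
    is_head[1:] = utterance_ids[1:] != utterance_ids[:-1]
    # Utterance-level scope: head tokens may attend to all other head tokens.
    head_to_head = is_head.unsqueeze(0) & is_head.unsqueeze(1)
    return same_utt | head_to_head

# Example: a 3-utterance dialog context of 2, 3, and 2 tokens.
utt = torch.tensor([0, 0, 1, 1, 1, 2, 2])
mask = hierarchical_attention_mask(utt)
print(mask.int())
```

A mask like this, combined with turn-level positional embeddings added alongside the usual token positions, lets a single standard self-attention stack play the roles of both the word-level and utterance-level encoders of a hierarchical RNN; for decoding, it would be intersected with the usual causal mask.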
URL
https://arxiv.org/abs/2011.08067