Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots

2018-09-06 12:24:04

Shaojie Jiang, Maarten de Rijke

arXiv_AI

arXiv_AI Review QA Sketch Dialog Chat

Abstract

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
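The two remedies named at the end of the abstract can be illustrated with a small sketch. It uses the generic textbook forms of label smoothing (mixing the one-hot target with a uniform distribution) and of a confidence penalty (subtracting a multiple of the output entropy from the loss), applied to a single token prediction. The function names, the `epsilon` and `beta` values, and the toy logits are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def label_smoothing_loss(logits, target_index, epsilon=0.1):
    # Cross-entropy against a smoothed target: (1 - epsilon) on the
    # reference token plus epsilon spread uniformly over the vocabulary.
    # epsilon is an illustrative value, not one taken from the paper.
    k = len(logits)
    probs = softmax(logits)
    smoothed = [epsilon / k] * k
    smoothed[target_index] += 1.0 - epsilon
    return -sum(t * math.log(p) for t, p in zip(smoothed, probs) if t > 0.0)

def confidence_penalty_loss(logits, target_index, beta=0.1):
    # Negative log-likelihood minus beta times the output entropy, so
    # low-entropy (over-confident) distributions get less of a discount.
    # beta is an illustrative value, not one taken from the paper.
    probs = softmax(logits)
    nll = -math.log(probs[target_index])
    return nll - beta * entropy(probs)

if __name__ == "__main__":
    peaked = [10.0, 0.0, 0.0]   # hypothetical over-confident logits
    softer = [2.0, 0.0, 0.0]    # hypothetical less confident logits
    for name, logits in (("peaked", peaked), ("softer", softer)):
        print(name,
              label_smoothing_loss(logits, 0),
              confidence_penalty_loss(logits, 0))
```

With `epsilon=0` the first loss reduces to the ordinary negative log-likelihood, and with `beta=0` so does the second, which makes it easy to compare either regularizer against an unregularized baseline.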

URL

https://arxiv.org/abs/1809.01941

PDF

https://arxiv.org/pdf/1809.01941.pdf