Abstract
Recent neural models for data-to-text generation are mostly based on data-driven, end-to-end training of encoder-decoder networks. Although the generated texts are largely fluent and informative, these models often produce descriptions that are inconsistent with the input structured data. This is a critical issue, especially in domains that require inference or calculation over raw data. In this paper, we attempt to improve the fidelity of neural data-to-text generation by utilizing pre-executed symbolic operations. We propose a framework called Operation-guided Attention-based sequence-to-sequence network (OpAtt), which uses a specifically designed gating mechanism together with a quantization module for operation results to incorporate information from pre-executed operations. Experiments on two sports datasets show that our proposed method clearly improves the fidelity of the generated texts to the input structured data.
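To make the idea of "pre-executed symbolic operations" concrete, the following is a minimal sketch, not the paper's code: it assumes a toy record table of game scores, a single illustrative `minus` operation, and hypothetical quantization bins. The paper's actual operation set, record schema, and quantization scheme may differ.

```python
# Hedged illustration of OpAtt's pipeline idea: symbolic operations are
# executed over the structured input *before* generation, and their raw
# results are quantized into coarse buckets so the decoder can condition
# on them. Operation names, records, and bins here are assumptions.

def pre_execute(records):
    """Apply a symbolic 'minus' operation to each pair of numeric records."""
    results = []
    names = list(records)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            results.append({
                "op": "minus",
                "args": (a, b),
                "value": records[a] - records[b],
            })
    return results

def quantize(value, bins=(0, 5, 10, 20)):
    """Map a raw operation result to a bucket index (illustrative bins),
    mimicking a quantization module for operation results."""
    magnitude = abs(value)
    for idx, edge in enumerate(bins):
        if magnitude <= edge:
            return idx
    return len(bins)

# Toy structured input: final scores of a game.
scores = {"home_pts": 102, "away_pts": 98}
ops = pre_execute(scores)
print(ops[0]["value"])            # 4  (a 4-point margin)
print(quantize(ops[0]["value"]))  # 1  (a "close game" bucket)
```

A margin of 4 and its bucket index could then be fed to the generator, so a phrase like "narrowly defeated" is grounded in an executed calculation rather than inferred by the decoder from raw numbers.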
URL
https://arxiv.org/abs/1809.02735