Abstract
In this work, we present novel approaches to exploiting sentential context for neural machine translation (NMT). Specifically, we first show that a shallow sentential context, extracted only from the top encoder layer, can improve translation performance by contextualizing the encoding representations of individual words. Next, we introduce a deep sentential context, which aggregates the sentential context representations from all internal layers of the encoder to form a more comprehensive context representation. Experimental results on the WMT14 English-to-German and English-to-French benchmarks show that our model consistently improves performance over the strong TRANSFORMER model (Vaswani et al., 2017), demonstrating the necessity and effectiveness of exploiting sentential context for NMT.
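The two notions of sentential context can be illustrated with a minimal sketch. Note the concrete choices below are assumptions for illustration only, not the paper's method: mean pooling stands in for the aggregation function, and simple addition stands in for the fusion of context into word representations.

```python
# Toy sketch of shallow vs. deep sentential context over encoder states.
# Assumptions (not from the paper): mean pooling as the aggregation
# function, and additive fusion of context into word representations.

def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def shallow_context(layers):
    """Shallow sentential context: pool the top encoder layer only."""
    return mean_pool(layers[-1])

def deep_context(layers):
    """Deep sentential context: pool each internal layer, then
    aggregate the per-layer sentence vectors across all layers."""
    per_layer = [mean_pool(layer) for layer in layers]
    return mean_pool(per_layer)

def contextualize(words, context):
    """Fuse the sentential context into each word encoding (additive here)."""
    return [[w + c for w, c in zip(word, context)] for word in words]

# Two encoder layers, three words, 2-dimensional hidden states.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # bottom layer
    [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],   # top layer
]
print(shallow_context(layers))               # context from the top layer only
print(deep_context(layers))                  # context aggregated over all layers
print(contextualize(layers[-1], shallow_context(layers)))
```

The shallow variant discards the lower layers entirely, while the deep variant lets every internal layer contribute to the sentence-level representation, which is what makes it "more comprehensive" in the abstract's sense.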
URL
https://arxiv.org/abs/1906.01268