Paper Reading AI Learner

Regularizing Neural Machine Translation by Target-bidirectional Agreement

2018-08-13 05:03:42
Zhirui Zhang, Shuangzhi Wu, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

Abstract

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed back as inputs to the model and can be quickly amplified, harming subsequent generation. To address this issue, we propose a novel model regularization method for NMT training, which aims to improve the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L) NMT decoders. This goal is achieved by introducing two Kullback-Leibler divergence regularization terms into the NMT training objective to reduce the mismatch between the output probabilities of the L2R and R2L models. In addition, we employ a joint training strategy that allows the L2R and R2L models to improve each other in an interactive update process. Experimental results show that our proposed method significantly outperforms state-of-the-art baselines on Chinese-English and English-German translation tasks.
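The core idea above, two KL terms that pull the L2R and R2L decoders' output distributions toward each other on top of the usual likelihood losses, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names and the interpolation weight `lam` are hypothetical, and real systems compute these terms over per-token softmax distributions in batch.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions given as
    probability lists over the same vocabulary."""
    return sum(pi * math.log(max(pi, eps) / max(qi, eps))
               for pi, qi in zip(p, q))

def agreement_regularizer(p_l2r, p_r2l):
    """Symmetric agreement penalty: KL(L2R || R2L) + KL(R2L || L2R).
    Zero when the two decoders assign identical probabilities."""
    return kl_divergence(p_l2r, p_r2l) + kl_divergence(p_r2l, p_l2r)

def joint_loss(nll_l2r, nll_r2l, p_l2r, p_r2l, lam=0.5):
    # Sketch of the joint training objective: both decoders' negative
    # log-likelihoods plus the agreement term; `lam` is a hypothetical
    # weight balancing likelihood against agreement.
    return nll_l2r + nll_r2l + lam * agreement_regularizer(p_l2r, p_r2l)
```

In the interactive update process described above, each model's parameters would be updated against a loss of this shape, with the other model's distributions held fixed as the agreement target for that step.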


URL

https://arxiv.org/abs/1808.04064

PDF

https://arxiv.org/pdf/1808.04064.pdf

