Abstract
In NMT, words are sometimes dropped from the source or generated repeatedly in the translation. We explore novel strategies that address the coverage problem by changing only the attention transformation. Our approach allocates fertilities to source words, which are used to bound the attention each word can receive. We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, shown to be differentiable and sparse. Empirical evaluation is provided in three language pairs.
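The constrained sparsemax the abstract describes can be understood as a Euclidean projection of the attention scores onto the probability simplex intersected with per-word upper bounds (the fertilities). A minimal NumPy sketch of that projection is below; it is an illustration under that reading, not the paper's implementation, and the fertility bounds `u` are assumed to be given:

```python
import numpy as np

def constrained_sparsemax(z, u, n_iter=60):
    """Project scores z onto {p : sum(p) = 1, 0 <= p <= u} in
    Euclidean distance. The KKT conditions give the closed form
    p_i = clip(z_i - tau, 0, u_i); we bisect on the threshold tau
    until the probabilities sum to 1. (Illustrative sketch only;
    the fertility bounds u are assumed supplied by the model.)"""
    z = np.asarray(z, dtype=float)
    u = np.asarray(u, dtype=float)
    assert u.sum() >= 1.0, "upper bounds must admit a distribution"
    # sum(clip(z - tau, 0, u)) is non-increasing in tau:
    # at lo every coordinate is capped (sum = sum(u) >= 1),
    # at hi every coordinate is zero (sum = 0 <= 1).
    lo, hi = (z - u).min(), z.max()
    for _ in range(n_iter):
        tau = 0.5 * (lo + hi)
        if np.clip(z - tau, 0.0, u).sum() > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(z - 0.5 * (lo + hi), 0.0, u)
```

With `z = [2.0, 1.0, 0.5]` and fertilities `u = [0.6, 1.0, 1.0]`, the first word is capped at its bound 0.6 and the third receives exactly zero attention, showing both properties the abstract claims: bounded and sparse.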
URL
https://arxiv.org/abs/1805.08241