Abstract
Latent variable modeling in non-autoregressive neural machine translation (NAT) is a promising approach to mitigating the multimodality problem. Previous work adds an auxiliary model to estimate the posterior distribution of the latent variable conditioned on the source and target sentences. However, this introduces several drawbacks: redundant information extraction in the latent variable, an increased number of parameters, and a tendency to ignore part of the information in the inputs. In this paper, we propose a new latent variable modeling approach based on a dual reconstruction perspective and an advanced hierarchical latent modeling approach. Our proposed method, LadderNMT, shares a latent space across both languages, so that it hypothetically alleviates or resolves the above drawbacks. Experimental results demonstrate, both quantitatively and qualitatively, that the proposed latent variable modeling learns an advantageous latent space and significantly improves translation quality on WMT translation tasks.
URL
https://arxiv.org/abs/2305.03511