Abstract
This paper describes the MeMAD project entry to the WMT Multimodal Machine Translation Shared Task. We propose adapting the Transformer neural machine translation (NMT) architecture to a multi-modal setting. We also describe the preliminary experiments with text-only translation systems that led us to this choice. According to the automatic metrics for flickr18, our system ranks first for English-to-German and fifth for English-to-French. Our experiments show that the effect of the visual features in our system is small; our largest gains come from the quality of the underlying text-only NMT system. We also find that appropriate use of additional data is effective.
URL
https://arxiv.org/abs/1808.10802