Abstract
The training paradigm for machine translation has gradually shifted from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on pretrained multilingual large language models (LLMs) with high-quality translation pairs. In this paper, we focus on boosting the many-to-many multilingual translation performance of LLMs with an emphasis on zero-shot translation directions. We demonstrate that prompt strategies adopted during instruction finetuning are crucial to zero-shot translation performance and introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages and improve zero-shot translation performance. XConST is not a new method, but a version of CrossConST (Gao et al., 2023a) adapted for multilingual finetuning on LLMs with translation instructions. Experimental results on ALMA (Xu et al., 2023) and LLaMA-2 (Touvron et al., 2023) show that our approach consistently improves translation performance. Our implementations are available at this https URL.
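Consistency regularization of this kind typically penalizes the divergence between the model's output distributions for semantically equivalent inputs in different languages. The sketch below illustrates the general idea with a symmetric KL penalty on toy next-token distributions; all names and values are illustrative assumptions, not the paper's actual implementation.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(p_a, p_b):
    """Symmetric KL penalty encouraging the prediction distributions
    obtained from two language views of the same sentence pair to agree,
    in the spirit of cross-lingual consistency regularization."""
    return 0.5 * (kl_divergence(p_a, p_b) + kl_divergence(p_b, p_a))

# Toy next-token distributions from an English prompt and a German prompt
# of the same sentence pair (hypothetical values).
p_en = [0.7, 0.2, 0.1]
p_de = [0.6, 0.3, 0.1]
print(consistency_loss(p_en, p_de))  # small positive penalty
print(consistency_loss(p_en, p_en))  # 0.0 when the views agree
```

In training, a term like this would be added to the standard translation cross-entropy loss with a weighting coefficient, pulling the representations of different languages toward a shared space.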
URL
https://arxiv.org/abs/2401.05861