Abstract
Despite the remarkable advancements in machine translation, the current sentence-level paradigm faces challenges when dealing with highly contextual languages like Japanese. In this paper, we explore how context-awareness can improve the performance of current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation, and what kind of context provides meaningful information to improve translation. As business dialogue involves complex discourse phenomena but offers scarce training resources, we adapt a pretrained mBART model, fine-tuning it on multi-sentence dialogue data, which allows us to experiment with different contexts. We investigate the impact of larger context sizes and propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type. We use Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses, and generalise CXMI to study the impact of the extra-sentential context. Overall, we find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size), and we provide a more focused analysis of honorifics translation. Regarding translation quality, increased source-side context paired with scene and speaker information improves model performance compared to previous work and our context-agnostic baselines, as measured by BLEU and COMET.
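The CXMI measure mentioned above compares how much probability mass a context-aware model assigns to the reference translations relative to a context-agnostic one. A minimal sketch of that estimation, assuming you already have per-sentence log-probabilities of the references under both models (the function name and inputs here are illustrative, not from the paper's code):

```python
def cxmi(logp_context_aware, logp_context_agnostic):
    """Estimate Conditional Cross-Mutual Information (CXMI).

    Both arguments are per-sentence log-probabilities of the reference
    translations: under a context-aware model q(y | x, C) and under a
    context-agnostic model q(y | x). CXMI > 0 indicates the context-aware
    model assigns higher probability to the references on average,
    i.e. the model is making use of the context C.
    """
    assert len(logp_context_aware) == len(logp_context_agnostic)
    n = len(logp_context_aware)
    # CXMI ≈ H(Y | X) - H(Y | X, C)
    #      ≈ mean over sentences of [ log q(y|x,C) - log q(y|x) ]
    return sum(a - b for a, b in zip(logp_context_aware, logp_context_agnostic)) / n

# Toy example: the context-aware model is slightly more confident
# on every sentence, so CXMI comes out positive.
print(cxmi([-10.2, -8.1, -12.0], [-10.8, -8.4, -12.5]))
```

The paper's generalisation to extra-sentential context follows the same recipe: the "context" side of the comparison includes the proposed speaker-turn and scene-type tokens rather than (or in addition to) preceding sentences.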
URL
https://arxiv.org/abs/2311.11976