Abstract
Pronoun translation is a longstanding challenge in neural machine translation (NMT), often requiring inter-sentential context to ensure linguistic accuracy. To address this, we introduce ProNMT, a novel framework designed to enhance pronoun and overall translation quality in context-aware machine translation systems. ProNMT leverages Quality Estimation (QE) models and a unique Pronoun Generation Likelihood-Based Feedback mechanism to iteratively fine-tune pre-trained NMT models without relying on extensive human annotations. The framework combines QE scores with pronoun-specific rewards to guide training, ensuring improved handling of linguistic nuances. Extensive experiments demonstrate significant gains in pronoun translation accuracy and general translation quality across multiple metrics. ProNMT offers an efficient, scalable, and context-aware approach to improving NMT systems, particularly in translating context-dependent elements like pronouns.
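The abstract describes combining a quality-estimation (QE) score with a pronoun-specific reward to guide fine-tuning. A minimal sketch of what such reward shaping might look like is below; the function name, the averaging of pronoun probabilities, and the interpolation weight `alpha` are all illustrative assumptions, not the paper's exact formulation.

```python
import math


def combined_reward(qe_score: float,
                    pronoun_log_likelihoods: list[float],
                    alpha: float = 0.5) -> float:
    """Blend a sentence-level QE score with a pronoun-generation reward.

    qe_score: quality-estimation score in [0, 1] for the hypothesis
        (hypothetical scale).
    pronoun_log_likelihoods: log-probabilities the NMT model assigned to
        the reference pronoun tokens (assumed input format).
    alpha: interpolation weight between the two signals (assumed).
    """
    if pronoun_log_likelihoods:
        # Average per-pronoun probability serves as the pronoun reward in [0, 1].
        pronoun_reward = sum(math.exp(lp) for lp in pronoun_log_likelihoods)
        pronoun_reward /= len(pronoun_log_likelihoods)
    else:
        # No pronouns in the sentence: fall back to the QE score alone.
        pronoun_reward = qe_score
    return alpha * qe_score + (1 - alpha) * pronoun_reward
```

In a reinforcement-style fine-tuning loop, a scalar like this would weight the policy-gradient update for each sampled translation, so sentences with well-translated pronouns are reinforced more strongly.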
URL
https://arxiv.org/abs/2501.03008