Abstract
An emerging research direction in Neural Machine Translation (NMT) involves Quality Estimation (QE) models, which correlate strongly with human judgment and can improve translations through Quality-Aware Decoding. Although several approaches based on sampling multiple candidate translations have been proposed, none have integrated these models directly into the decoding process. In this paper, we address this gap by proposing a novel token-level QE model capable of reliably scoring partial translations. We build a uni-directional QE model for this purpose, as decoder models are inherently trained on partial sequences and process them efficiently. We then present a decoding strategy that integrates the QE model for Quality-Aware Decoding and demonstrate that translation quality improves compared with N-best list re-ranking using state-of-the-art QE models (up to $1.39$ XCOMET-XXL $\uparrow$). Finally, we show that our approach provides significant benefits in document translation tasks, where the quality of N-best lists is typically suboptimal.
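To make the idea of scoring partial translations during decoding concrete, here is a minimal sketch of a beam search that ranks each partial hypothesis by its NMT log-probability plus a weighted QE score. It is an illustration under stated assumptions, not the paper's implementation: the function names (`quality_aware_beam_search`, `toy_nmt`, `toy_qe`), the linear combination with weight `alpha`, and both toy models are hypothetical stand-ins for a real NMT model and a real token-level QE model.

```python
import math
from typing import Callable, Dict, List, Tuple

Token = str
Hypothesis = Tuple[List[Token], float]  # (tokens so far, accumulated NMT log-prob)


def quality_aware_beam_search(
    next_token_logprobs: Callable[[List[Token]], Dict[Token, float]],
    partial_qe_score: Callable[[List[Token]], float],
    beam_size: int = 2,
    alpha: float = 1.0,
    max_len: int = 10,
    eos: Token = "</s>",
) -> List[Token]:
    """Beam search that ranks partial hypotheses by NMT log-prob + alpha * QE score.

    The linear combination is an assumption made for this sketch.
    """

    def combined(hyp: Hypothesis) -> float:
        tokens, logp = hyp
        return logp + alpha * partial_qe_score(tokens)

    beams: List[Hypothesis] = [([], 0.0)]
    finished: List[Hypothesis] = []

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, logp in beams:
            for tok, tok_logp in next_token_logprobs(tokens).items():
                candidates.append((tokens + [tok], logp + tok_logp))
        if not candidates:
            break

        # Rank the expansions with the QE-augmented score and keep the best ones.
        candidates.sort(key=combined, reverse=True)
        beams = []
        for hyp in candidates[:beam_size]:
            if hyp[0][-1] == eos:
                finished.append(hyp)
            else:
                beams.append(hyp)
        if not beams:
            break

    pool = finished or beams
    return max(pool, key=combined)[0]


def toy_nmt(prefix: List[Token]) -> Dict[Token, float]:
    """Hypothetical NMT model: slightly prefers 'bad' as the first token."""
    if not prefix:
        return {"good": math.log(0.4), "bad": math.log(0.6)}
    return {"</s>": 0.0}


def toy_qe(prefix: List[Token]) -> float:
    """Hypothetical token-level QE model: penalizes any prefix containing 'bad'."""
    return -2.0 if "bad" in prefix else 0.0


if __name__ == "__main__":
    print(quality_aware_beam_search(toy_nmt, toy_qe, alpha=0.0))
    print(quality_aware_beam_search(toy_nmt, toy_qe, alpha=1.0))
```

With `alpha=0.0` the search reduces to ordinary beam search and follows the NMT model's preference; with `alpha=1.0` the QE penalty on the partial hypothesis changes which continuation is selected. By contrast, N-best re-ranking would apply the QE score only after complete translations have been generated.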
URL
https://arxiv.org/abs/2502.08561