Abstract
Ensembling neural machine translation (NMT) models to produce higher-quality translations than any of the $L$ individual models has been extensively studied. Recent methods typically employ a candidate selection block (CSB) and an encoder-decoder fusion block (FB), which requires inference across \textit{all} candidate models and incurs significant computational overhead, generally $\Omega(L)$. This paper introduces \textbf{SmartGen}, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidates and identifying optimal groups to pass to the fusion block for each input sentence. Furthermore, in prior work the CSB and FB were trained independently, leading to suboptimal NMT performance. Our DQN-based \textbf{SmartGen} addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without modification, by introducing a Competitive Correction Block (CCB). Finally, we validate our approach with extensive experiments on English-Hindi translation tasks in both directions.
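To make the abstract's central idea concrete, the sketch below illustrates a DQN-style candidate selector whose reward comes from the fusion block's output quality, so that the CSB is trained with feedback from the FB rather than independently. This is only an illustrative assumption of how such a loop could look: the feature construction, the `fuse_and_score` stub, the top-$K$ action choice, and all dimensions are hypothetical placeholders, not the paper's actual implementation.

```python
# Hypothetical sketch: DQN-style candidate selection driven by fusion-block feedback.
# All names, sizes, and the reward stub below are illustrative assumptions.
import random
import torch
import torch.nn as nn

L, K, FEAT = 8, 3, 32                    # L candidate models, select K per sentence


class QNet(nn.Module):
    """Scores each of the L candidates given per-candidate features."""

    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(FEAT, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats):             # feats: (L, FEAT)
        return self.mlp(feats).squeeze(-1)  # (L,) -> one Q-value per candidate


def fuse_and_score(selected_ids):
    """Placeholder for the fusion block plus a quality metric (e.g. BLEU/COMET).
    In the abstract's description this feedback serves as the RL reward;
    here it is random so the sketch stays self-contained."""
    return random.random()


qnet = QNet()
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
eps = 0.1                                 # epsilon-greedy exploration rate

for step in range(1000):                  # one input sentence per step
    feats = torch.randn(L, FEAT)          # stand-in for per-candidate features
    q = qnet(feats)
    if random.random() < eps:             # explore: random K-subset of candidates
        chosen = torch.randperm(L)[:K]
    else:                                 # exploit: candidates with top-K Q-values
        chosen = torch.topk(q, K).indices
    reward = fuse_and_score(chosen.tolist())
    # One-step (bandit-style) target: push Q of each chosen candidate toward the
    # reward observed from the fused translation.
    loss = ((q[chosen] - reward) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the sketch is the coupling: because the training signal for the selector is derived from the fusion block's output quality, selection and fusion are no longer optimized in isolation, which is the gap the abstract attributes to earlier CSB/FB pipelines.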
URL
https://arxiv.org/abs/2501.15219