Paper Reading AI Learner

Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction

2025-01-25 13:50:18
Kritarth Prasad, Mohammadi Zaki, Pratik Singh, Pankaj Wasnik

Abstract

Ensembling neural machine translation (NMT) models to produce higher-quality translations than the $L$ individual models has been extensively studied. Recent methods typically employ a candidate selection block (CSB) and an encoder-decoder fusion block (FB), requiring inference across \textit{all} candidate models and leading to significant computational overhead, generally $\Omega(L)$. This paper introduces \textbf{SmartGen}, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidates and identifying optimal groups to pass to the fusion block for each input sentence. Furthermore, in prior methods the CSB and FB were trained independently, leading to suboptimal NMT performance. Our DQN-based \textbf{SmartGen} addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without modification, by introducing a Competitive Correction Block (CCB). Finally, we validate our approach with extensive experiments on English-Hindi translation tasks in both directions.
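
The abstract describes a DQN-style selector that, for each source sentence, picks a small fixed subset of the $L$ candidate models and uses the fusion block's output quality as the reward. Below is a minimal, simplified sketch of that idea (closer to a contextual-bandit update than a full DQN with replay and target networks). All names here (QNet, select_candidates, dqn_update) and the choice of reward signal are illustrative assumptions, not code from the paper.

```python
# Hypothetical sketch: a Q-network scores each of the L candidate NMT models
# for a given source-sentence embedding; the top-k candidates are sent to the
# fusion block, whose translation quality (e.g. sentence-level BLEU/COMET)
# is fed back as the reward. Illustrative only.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Maps a source-sentence embedding to one Q-value per candidate model."""
    def __init__(self, embed_dim: int, num_models: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_models),
        )

    def forward(self, sent_embed: torch.Tensor) -> torch.Tensor:
        return self.mlp(sent_embed)  # shape: (batch, num_models)

def select_candidates(qnet: QNet, sent_embed: torch.Tensor, k: int,
                      epsilon: float = 0.1) -> torch.Tensor:
    """Epsilon-greedy choice of k candidate-model indices for one sentence."""
    num_models = qnet.mlp[-1].out_features
    if torch.rand(()) < epsilon:
        return torch.randperm(num_models)[:k]            # explore
    q_values = qnet(sent_embed.unsqueeze(0)).squeeze(0)
    return torch.topk(q_values, k).indices               # exploit

def dqn_update(qnet: QNet, optimizer: torch.optim.Optimizer,
               sent_embed: torch.Tensor, chosen: torch.Tensor,
               reward: float) -> float:
    """One TD(0)-style step: push the Q-values of the chosen subset toward
    the reward returned by the fusion block for this sentence."""
    q_values = qnet(sent_embed.unsqueeze(0)).squeeze(0)
    target = torch.full_like(q_values[chosen], reward)
    loss = nn.functional.mse_loss(q_values[chosen], target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch only the k selected models are decoded at inference time, which is how the method avoids the $\Omega(L)$ cost of running every candidate; how the paper exactly defines the state, reward, and joint training with the fusion block is detailed in the full text.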

Abstract (translated)

Ensembling neural machine translation (NMT) models to produce higher-quality translations than the $L$ individual models has been studied extensively. Recent methods typically use a candidate selection block (CSB) and an encoder-decoder fusion block (FB), which requires running inference over all candidate models and incurs significant computational overhead, generally on the order of $\Omega(L)$. This paper introduces \textbf{SmartGen}, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidate models and identifying the optimal group to pass to the fusion block for each input sentence. Moreover, in earlier methods the CSB and FB were trained independently, leading to suboptimal NMT performance. Our DQN-based \textbf{SmartGen} addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without any modification, by introducing a Competitive Correction Block (CCB). Finally, we validate the effectiveness of our approach with extensive experiments on English-Hindi translation tasks in both directions.

URL

https://arxiv.org/abs/2501.15219

PDF

https://arxiv.org/pdf/2501.15219.pdf

