A Study of Reinforcement Learning for Neural Machine Translation

2018-08-27 14:43:38
Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-Yan Liu

Abstract

Recent studies have shown that reinforcement learning (RL) is an effective approach for improving the performance of neural machine translation (NMT) systems. However, due to its instability, successful RL training is challenging, especially in real-world systems where deep models and large datasets are leveraged. In this paper, taking several large-scale translation tasks as testbeds, we conduct a systematic study on how to train better NMT models using reinforcement learning. We provide a comprehensive comparison of several important factors (e.g., baseline reward, reward shaping) in RL training. Furthermore, since it remains unclear whether RL is still beneficial when monolingual data is used, we propose a new method that leverages RL to further boost the performance of NMT systems trained with source/target monolingual data. By integrating all our findings, we obtain competitive results on the WMT14 English-German, WMT17 English-Chinese, and WMT17 Chinese-English translation tasks, and in particular set a new state of the art on the WMT17 Chinese-English translation task.
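The two factors named in the abstract, baseline reward and reward shaping, can be illustrated with a small sketch. This is a generic REINFORCE-style example, not the paper's implementation: the function names `reinforce_loss` and `shape_rewards`, the use of a batch-mean baseline, and the "difference of partial scores" form of shaping are all assumptions chosen for illustration.

```python
def reinforce_loss(log_probs, rewards, use_baseline=True):
    """Surrogate policy-gradient loss for a batch of sampled translations.

    log_probs: sequence-level log-probabilities log p(y|x), one per sample.
    rewards:   sentence-level rewards (for example BLEU scaled to [0, 1]).
    The batch-mean baseline is an assumption; a learned baseline is another option.
    """
    if len(log_probs) != len(rewards):
        raise ValueError("log_probs and rewards must have the same length")
    # Subtracting a baseline lowers gradient variance without changing
    # the expected gradient.
    baseline = sum(rewards) / len(rewards) if use_baseline else 0.0
    advantages = [r - baseline for r in rewards]
    # Minimizing this loss raises the probability of above-baseline samples.
    loss = -sum(a * lp for a, lp in zip(advantages, log_probs)) / len(rewards)
    return loss, advantages


def shape_rewards(partial_scores):
    """Turn scores of successive prefixes into per-step rewards.

    partial_scores[t] is the (hypothetical) score of the first t+1 tokens.
    Each step receives the score gained by adding that token, so the
    per-step rewards sum to the final sentence-level score.
    """
    shaped = []
    previous = 0.0
    for score in partial_scores:
        shaped.append(score - previous)
        previous = score
    return shaped
```

With a baseline, samples that score below the batch average receive a negative advantage and are pushed down, while shaping spreads a single terminal reward over the decoding steps that earned it.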

URL

https://arxiv.org/abs/1808.08866

PDF

https://arxiv.org/pdf/1808.08866.pdf

