Unsupervised Statistical Machine Translation

2018-09-04 23:22:28
Mikel Artetxe, Gorka Labaka, Eneko Agirre

Abstract

While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest. In this paper, we propose an alternative approach based on phrase-based Statistical Machine Translation (SMT) that significantly closes the gap with supervised systems. Our method profits from the modular architecture of SMT: we first induce a phrase table from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. In addition, iterative backtranslation improves results further, yielding, for instance, 14.08 and 26.22 BLEU points in WMT 2014 English-German and English-French, respectively, an improvement of more than 7-10 BLEU points over previous unsupervised systems, and closing the gap with supervised SMT (Moses trained on Europarl) down to 2-5 BLEU points. Our implementation is available at this https URL
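The phrase-table induction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the source and target embeddings have already been mapped into a shared cross-lingual space, and it scores candidate translations with a softmax over cosine similarities. The function names, the toy two-dimensional embeddings, the temperature value, and the `top_k` cutoff are all hypothetical choices made for illustration; this is not the authors' released implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def induce_phrase_table(src_emb, tgt_emb, temperature=0.1, top_k=2):
    """Build a toy phrase table from embeddings in a shared space.

    For each source entry, keep the top_k nearest target entries and
    turn their cosine similarities into probabilities with a softmax.
    The temperature and top_k defaults are illustrative assumptions.
    """
    table = {}
    for src, u in src_emb.items():
        sims = {tgt: cosine(u, v) for tgt, v in tgt_emb.items()}
        best = sorted(sims, key=sims.get, reverse=True)[:top_k]
        weights = {tgt: math.exp(sims[tgt] / temperature) for tgt in best}
        total = sum(weights.values())
        table[src] = {tgt: w / total for tgt, w in weights.items()}
    return table


# Toy embeddings, assumed to be already mapped into one space.
src_emb = {"house": [1.0, 0.1], "dog": [0.1, 1.0]}
tgt_emb = {"Haus": [0.9, 0.2], "Hund": [0.2, 0.9], "Katze": [0.3, 0.8]}

table = induce_phrase_table(src_emb, tgt_emb)
for src, candidates in table.items():
    print(src, candidates)
```

In a full pipeline, scores of this kind would be combined with an n-gram language model inside a phrase-based decoder, and the resulting system would then be refined through tuning and iterative backtranslation as the abstract describes.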

URL

https://arxiv.org/abs/1809.01272

PDF

https://arxiv.org/pdf/1809.01272.pdf

