Paper Reading AI Learner

Debiasing Word Embeddings Improves Multimodal Machine Translation

2019-05-24 22:11:57
Tosho Hirasawa, Mamoru Komachi

Abstract

In recent years, pretrained word embeddings have proved useful for multimodal neural machine translation (NMT) models to address the shortage of available datasets. However, the integration of pretrained word embeddings has not yet been explored extensively. Further, pretrained word embeddings in high dimensional spaces have been reported to suffer from the hubness problem. Although some debiasing techniques have been proposed to address this problem for other natural language processing tasks, they have seldom been studied for multimodal NMT models. In this study, we examine various kinds of word embeddings and introduce two debiasing techniques for three multimodal NMT models and two language pairs -- English-German translation and English-French translation. With our optimal settings, the overall performance of multimodal models was improved by up to +1.93 BLEU and +2.02 METEOR for English-German translation and +1.73 BLEU and +0.95 METEOR for English-French translation.
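
The abstract refers to debiasing techniques for pretrained word embeddings but does not name them in this listing. One widely used method for countering hubness in high-dimensional embedding spaces is "All-but-the-Top" (Mu and Viswanath, 2018): center the vectors and remove their projections onto the top principal components. The sketch below is an illustrative NumPy implementation of that method only; the function name, shapes, and parameter choices are assumptions for illustration, not the authors' code or necessarily the exact techniques evaluated in the paper.

# Illustrative sketch of "All-but-the-Top" embedding debiasing (an assumption,
# not confirmed to be the paper's method): subtract the mean vector, then
# remove the projections onto the top principal components.
import numpy as np


def all_but_the_top(embeddings: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Debias a (vocab_size, dim) pretrained embedding matrix.

    embeddings: one row per word.
    n_components: number of dominant directions to remove
                  (a common heuristic is dim // 100).
    """
    # 1. Center the embeddings.
    mean = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mean

    # 2. Find the top principal directions via SVD; rows of vt are the
    #    principal directions of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_directions = vt[:n_components]            # (n_components, dim)

    # 3. Remove the projection of every vector onto those directions.
    projections = centered @ top_directions.T     # (vocab_size, n_components)
    return centered - projections @ top_directions


# Usage (shapes are hypothetical): debias 300-dimensional vectors before
# loading them into an NMT embedding layer.
if __name__ == "__main__":
    vectors = np.random.randn(10000, 300).astype(np.float32)
    debiased = all_but_the_top(vectors, n_components=3)
    print(debiased.shape)  # (10000, 300)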

URL

https://arxiv.org/abs/1905.10464

PDF

https://arxiv.org/pdf/1905.10464.pdf

