Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues

2023-11-20 18:06:03
Sumire Honda, Patrick Fernandes, Chrysoula Zerva

Abstract

Despite the remarkable advancements in machine translation, the current sentence-level paradigm faces challenges when dealing with highly contextual languages like Japanese. In this paper, we explore how context-awareness can improve the performance of current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation, and what kind of context provides meaningful information for improving translation. As business dialogue involves complex discourse phenomena but offers scarce training resources, we adapt a pretrained mBART model, fine-tuning it on multi-sentence dialogue data, which allows us to experiment with different contexts. We investigate the impact of larger context sizes and propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type. We make use of Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses, and generalise CXMI to study the impact of the extra-sentential context. Overall, we find that the models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size), and we provide a more focused analysis of honorifics translation. Regarding translation quality, increased source-side context paired with scene and speaker information improves model performance over previous work and our context-agnostic baselines, as measured by BLEU and COMET.
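
For reference, CXMI (introduced by Fernandes et al., 2021) is estimated as the average per-sentence log-probability gain that a context-aware model q_C obtains over a context-agnostic model q_A on held-out references:

\mathrm{CXMI}(C \rightarrow Y \mid X) = H_{q_A}(Y \mid X) - H_{q_C}(Y \mid X, C) \approx -\frac{1}{N} \sum_{i=1}^{N} \log \frac{q_A\big(y^{(i)} \mid x^{(i)}\big)}{q_C\big(y^{(i)} \mid x^{(i)}, C^{(i)}\big)}

Below is a minimal Python sketch of the two ingredients the abstract describes: source-side concatenation of preceding sentences with extra-sentential tags, and the Monte Carlo CXMI estimate. The tag strings (<scene:...>, <spk:...>, </t>) and the scene label are illustrative placeholders, not the paper's actual token inventory.

def build_source(history, speaker, current, scene):
    """Concatenate preceding (speaker, sentence) pairs and the current
    sentence into one source string, prefixed with a scene-type tag.
    All tag names here are hypothetical placeholders."""
    parts = [f"<spk:{spk}> {sent}" for spk, sent in history]
    parts.append(f"<spk:{speaker}> {current}")
    return f"<scene:{scene}> " + " </t> ".join(parts)

def cxmi(logp_ctx, logp_base):
    """Monte Carlo CXMI: mean per-sentence log-probability gain of the
    context-aware model over the context-agnostic one on the same references."""
    assert len(logp_ctx) == len(logp_base)
    return sum(c - b for c, b in zip(logp_ctx, logp_base)) / len(logp_ctx)

# Example with dummy per-sentence reference log-probs from the two models:
print(build_source([("A", "Good morning.")], "B", "Shall we start?", "face-to-face"))
print(cxmi([-10.2, -8.7, -12.1], [-10.9, -8.9, -12.8]))  # positive => context helps

A positive CXMI means the context-aware model assigns higher probability to the reference translations, i.e. it is actually using the extra context; the finding that CXMI grows with context size corresponds to this quantity increasing as more preceding sentences are included.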

URL

https://arxiv.org/abs/2311.11976

PDF

https://arxiv.org/pdf/2311.11976.pdf

