MultiSlav: Using Cross-Lingual Knowledge Transfer to Combat the Curse of Multilinguality

2025-02-20 12:35:25
Artur Kot, Mikołaj Koszowski, Wojciech Chojnowski, Mieszko Rutkowski, Artur Nowakowski, Kamil Guttmann, Mikołaj Pokrywka

Abstract

Does multilingual Neural Machine Translation (NMT) lead to the curse of multilinguality, or does it provide cross-lingual knowledge transfer within a language family? In this study, we explore multiple approaches for extending the available data regime in NMT, and we demonstrate cross-lingual benefits even in the zero-shot translation regime for low-resource languages. With this paper, we provide state-of-the-art open-source NMT models for translating between selected Slavic languages. We released our models on the HuggingFace Hub (this https URL) under the CC BY 4.0 license. The Slavic language family comprises morphologically rich Central and Eastern European languages. Although Slavic languages have hundreds of millions of native speakers, Slavic Neural Machine Translation is, in our opinion, under-studied. Recent NMT research focuses mostly on one of the following: high-resource languages such as English, Spanish, and German (in the WMT23 General Translation Task, 7 out of 8 task directions are from or to English); massively multilingual models covering multiple language groups; or evaluation techniques.

URL

https://arxiv.org/abs/2502.14509

PDF

https://arxiv.org/pdf/2502.14509.pdf

