Paper Reading AI Learner

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

2024-04-18 06:20:50
Geyu Lin, Bin Wang, Zhengyuan Liu, Nancy F. Chen

Abstract

Multilingual proficiency presents a significant challenge for large language models (LLMs). English-centric models are usually suboptimal in other languages, particularly those that are linguistically distant from English. This performance discrepancy mainly stems from the imbalanced distribution of training data across languages during pre-training and instruction tuning stages. To address this problem, we propose a novel approach called CrossIn, which utilizes a mixed composition of cross-lingual instruction tuning data. Our method leverages the compressed representation shared by various languages to efficiently enhance the model's task-solving capabilities and multilingual proficiency within a single process. In addition, we introduce a multi-task and multi-faceted benchmark to evaluate the effectiveness of CrossIn. Experimental results demonstrate that our method substantially improves performance across tasks and languages, and we provide extensive insights into the impact of cross-lingual data volume and the integration of translation data on enhancing multilingual consistency and accuracy.
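
The abstract does not spell out how the "mixed composition of cross-lingual instruction tuning data" is built, so the following is only a minimal sketch of one plausible reading, not the authors' recipe. It assumes parallel instruction/response pairs are available in several languages and mixes three kinds of samples: an English-only base, cross-lingual samples (instruction in one language, response in another), and translation samples. The toy data, the function names (`make_cross_lingual`, `make_translation`, `build_mixture`), the prompt wording, and the mixing counts are all hypothetical.

```python
import random
from typing import Dict, List

# Hypothetical toy data: the same instruction/response pair in several languages.
PARALLEL: List[Dict[str, Dict[str, str]]] = [
    {
        "en": {"instruction": "Name the capital of France.", "output": "Paris."},
        "es": {"instruction": "Nombra la capital de Francia.", "output": "París."},
    },
    {
        "en": {"instruction": "What is two plus two?", "output": "Four."},
        "es": {"instruction": "¿Cuánto es dos más dos?", "output": "Cuatro."},
    },
]


def make_cross_lingual(item: Dict[str, Dict[str, str]], rng: random.Random) -> dict:
    """Pair an instruction in one language with the response in a different one."""
    src, tgt = rng.sample(sorted(item), 2)  # two distinct language codes
    return {
        "instruction": item[src]["instruction"],
        "output": item[tgt]["output"],
        "langs": (src, tgt),
    }


def make_translation(item: Dict[str, Dict[str, str]], rng: random.Random) -> dict:
    """Turn a parallel instruction pair into an explicit translation task."""
    src, tgt = rng.sample(sorted(item), 2)
    return {
        "instruction": f"Translate from {src} to {tgt}: {item[src]['instruction']}",
        "output": item[tgt]["instruction"],
        "langs": (src, tgt),
    }


def build_mixture(items, n_cross: int, n_translation: int, seed: int = 0) -> List[dict]:
    """Combine English-only, cross-lingual, and translation samples, then shuffle."""
    rng = random.Random(seed)
    # Assumption: the base instruction set is English-only.
    base = [dict(item["en"], langs=("en", "en")) for item in items]
    cross = [make_cross_lingual(rng.choice(items), rng) for _ in range(n_cross)]
    trans = [make_translation(rng.choice(items), rng) for _ in range(n_translation)]
    mixture = base + cross + trans
    rng.shuffle(mixture)
    return mixture
```

Calling `build_mixture(PARALLEL, n_cross=3, n_translation=2)` yields a shuffled list of seven samples. Varying `n_cross` and `n_translation` is one simple way to probe the effect of cross-lingual data volume and translation data that the abstract mentions.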

URL

https://arxiv.org/abs/2404.11932

PDF

https://arxiv.org/pdf/2404.11932.pdf

