Abstract
In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method: pre-training on high-resource tasks, followed by fine-tuning on a mixture of high- and low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits, showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under which data regimes this method is applicable and demonstrate its improvements empirically in neural machine translation (NMT) and multilingual language modeling.
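The two-stage schedule described above can be sketched as a task-sampling procedure: during pre-training only high-resource tasks receive sampling weight, and after a switch point the sampler moves to a mixture over all tasks. This is a minimal illustrative sketch, not the paper's implementation; the function names, the step-based switch point, and the specific weight vectors are all assumptions made for illustration.

```python
import random

def sample_task(weights):
    """Draw a task index in proportion to the given (unnormalized) weights."""
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1  # guard against floating-point edge cases

def training_schedule(steps, pretrain_frac, pretrain_weights, mixture_weights):
    """Hypothetical two-stage schedule: pre-train on high-resource tasks only
    (low-resource weights set to 0), then fine-tune on a mixture of all tasks."""
    switch = int(steps * pretrain_frac)
    schedule = []
    for step in range(steps):
        weights = pretrain_weights if step < switch else mixture_weights
        schedule.append(sample_task(weights))
    return schedule

# Example: tasks 0 and 1 are high-resource, task 2 is low-resource.
# Pre-train for the first half of training, then mix in task 2.
sched = training_schedule(
    steps=1000,
    pretrain_frac=0.5,
    pretrain_weights=[1.0, 1.0, 0.0],
    mixture_weights=[1.0, 1.0, 1.0],
)
```

The contrast with static weighting is that a static scheme would use a single fixed weight vector for all steps, forcing one point on the high/low-resource trade-off curve, whereas the two-stage schedule changes the sampling distribution partway through training.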
URL
https://arxiv.org/abs/2312.06134