Communication Efficient Federated Learning for Multilingual Neural Machine Translation with Adapter

2023-05-21 12:48:38
Yi Liu, Xiaohan Bi, Lei Li, Sishuo Chen, Wenkai Yang, Xu Sun

Abstract

Federated Multilingual Neural Machine Translation (Fed-MNMT) has emerged as a promising paradigm for institutions with limited language resources. This approach allows multiple institutions to act as clients and train a unified model through model synchronization, rather than collecting sensitive data for centralized training. This significantly reduces the cost of corpus collection and preserves data privacy. However, as pre-trained language models (PLMs) continue to increase in size, the communication cost of transmitting parameters during synchronization has become a training speed bottleneck. In this paper, we propose a communication-efficient Fed-MNMT framework that addresses this issue by keeping PLMs frozen and only transferring lightweight adapter modules between clients. Since different language pairs exhibit substantial discrepancies in data distributions, the adapter parameters of different clients may conflict with each other. To tackle this, we explore various clustering strategies to group parameters for integration and mitigate the negative effects of conflicting parameters. Experimental results demonstrate that our framework reduces communication cost by over 98% while achieving similar or even better performance compared to competitive baselines. Further analysis reveals that the clustering strategies effectively solve the problem of linguistic discrepancy, and that pruning adapter modules further improves communication efficiency.
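The efficiency idea described in the abstract can be pictured with a minimal PyTorch sketch: a bottleneck adapter is inserted into the PLM, the backbone is frozen, and only the adapter weights are trained locally and exchanged during synchronization. The names below (Adapter, freeze_backbone, and the convention that adapter parameters contain "adapter" in their names) are illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual connection leaves the frozen backbone's
        # representation intact when the adapter is near-zero.
        return x + self.up(self.act(self.down(x)))

def freeze_backbone(model: nn.Module) -> None:
    """Train (and later transmit) only adapter parameters; freeze the rest."""
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name

Because each adapter contributes only two small linear layers per module, the per-round communication payload shrinks from the full PLM to a small fraction of it, which is consistent with the over-98% reduction reported in the abstract.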

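On the aggregation side, the clustering strategies mentioned in the abstract amount to averaging adapter parameters only within groups of clients whose language pairs are compatible, so that clients with conflicting data distributions are never merged together. The sketch below is a hedged illustration under assumed conventions: cluster_average is a hypothetical helper, each client is assumed to upload its adapter state_dict, and the grouping by language family is one plausible clustering criterion, not the paper's specified one.

import torch

def cluster_average(client_adapters: dict, clusters: dict) -> dict:
    """Average adapter state_dicts within each cluster of clients only."""
    merged = {}
    for members in clusters.values():
        states = [client_adapters[c] for c in members]
        # Parameter-wise mean over the cluster's adapter tensors.
        avg = {key: torch.stack([s[key] for s in states]).mean(dim=0)
               for key in states[0]}
        for c in members:
            merged[c] = avg
    return merged

# Hypothetical usage: group clients by language family before aggregation.
clusters = {"romance": ["fr-en", "es-en"], "slavic": ["ru-en", "pl-en"]}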

URL

https://arxiv.org/abs/2305.12449

PDF

https://arxiv.org/pdf/2305.12449.pdf

