Finding Sparse Structure for Domain Specific Neural Machine Translation

2020-12-19 03:33:27

Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei Li

arXiv_AI

Abstract

Fine-tuning is a major approach to domain adaptation in Neural Machine Translation (NMT). However, unconstrained fine-tuning requires very careful hyper-parameter tuning; otherwise the model easily over-fits the target domain and degrades on the general domain. To mitigate this, we propose PRUNE-TUNE, a novel domain adaptation method based on gradual pruning. It learns tiny domain-specific subnetworks for tuning. During adaptation to a new domain, we tune only the subnetwork corresponding to that domain. PRUNE-TUNE alleviates both the over-fitting and the degradation problems without modifying the model. Additionally, because domain-specific subnetworks do not overlap, PRUNE-TUNE is also capable of sequential multi-domain learning. Experimental results show that PRUNE-TUNE outperforms several strong competitors on the target-domain test set without degrading general-domain quality, in both single-domain and multi-domain settings.
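
The idea of tuning only a non-overlapping, domain-specific subnetwork can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. The abstract does not state the pruning criterion, the sparsity schedule, or how subnetworks are allocated, so the magnitude-based criterion, the linear schedule, and the helper names (`magnitude_mask`, `gradual_sparsity`, `allocate_domain`, `masked_update`) are all assumptions made for illustration.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that prunes the smallest-magnitude weights.

    Assumption: magnitude-based pruning; `sparsity` is the pruned fraction.
    """
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def gradual_sparsity(step, total_steps, final_sparsity):
    """Assumed linear schedule: ramp sparsity from 0 up to final_sparsity."""
    return final_sparsity * min(step, total_steps) / total_steps


def allocate_domain(free, n_params):
    """Claim up to n_params free slots for one domain.

    Returns (domain_mask, remaining_free), so later domains never overlap.
    """
    domain = [0] * len(free)
    remaining = list(free)
    for i, is_free in enumerate(free):
        if is_free and sum(domain) < n_params:
            domain[i], remaining[i] = 1, 0
    return domain, remaining


def masked_update(weights, grads, trainable, lr):
    """One SGD step applied only where trainable[i] == 1."""
    return [w - lr * g * t for w, g, t in zip(weights, grads, trainable)]


# Toy "general-domain" weights (hypothetical values).
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]

# Gradual pruning: raise sparsity in stages. A real system would retrain
# between stages; that step is omitted here.
total_steps = 2
for step in range(1, total_steps + 1):
    general_mask = magnitude_mask(weights, gradual_sparsity(step, total_steps, 0.5))
    weights = [w * m for w, m in zip(weights, general_mask)]

# Pruned positions become free capacity for domain-specific subnetworks.
free = [1 - m for m in general_mask]
domain_a, free = allocate_domain(free, 2)
domain_b, free = allocate_domain(free, 1)

# Adapting to domain A touches only domain A's subnetwork.
grads = [1.0] * len(weights)  # placeholder gradients
tuned = masked_update(weights, grads, domain_a, lr=0.1)
```

In this sketch the weights kept by `general_mask` are identical before and after the domain update, which mirrors the claim that general-domain quality is preserved, and `domain_a` and `domain_b` share no positions, which is what allows domains to be added one after another.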

URL

https://arxiv.org/abs/2012.10586

PDF

https://arxiv.org/pdf/2012.10586.pdf