Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

2022-05-24 15:28:09

Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

arXiv_CL Transfer_Learning Embedding Knowledge Pose Few-Shot Zero-Shot

Abstract

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a unified hypernetwork that generates weights for parameter-efficient adapter modules conditioned on both task and language embeddings. By learning to combine task- and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best gain when a mixture of multiple resources is available, while performing on par with strong baselines in the standard scenario. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages and tasks, showing the effectiveness of our approach beyond zero-shot transfer.
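
The core idea in the abstract, a hypernetwork that produces adapter weights from a task embedding and a language embedding, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `HyperAdapter`, all sizes, the tanh projection, the random initialization, and the task and language labels are assumptions chosen only for illustration, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
HIDDEN = 16      # hidden size of one frozen encoder layer
BOTTLENECK = 4   # adapter bottleneck size
EMB = 8          # size of each task / language embedding
SOURCE = 12      # size of the combined source representation

TASKS = ["ner", "pos"]        # hypothetical task labels
LANGS = ["en", "tr", "ar"]    # hypothetical language labels


class HyperAdapter:
    """Hypothetical hypernetwork that emits adapter weights per (task, language)."""

    def __init__(self):
        # One embedding per task and per language; these would be learned.
        self.task_emb = {t: rng.normal(0.0, 0.1, EMB) for t in TASKS}
        self.lang_emb = {l: rng.normal(0.0, 0.1, EMB) for l in LANGS}
        # Shared projection of the concatenated embeddings.
        self.w_src = rng.normal(0.0, 0.1, (2 * EMB, SOURCE))
        # Shared heads that output the flattened adapter matrices.
        self.w_down = rng.normal(0.0, 0.1, (SOURCE, HIDDEN * BOTTLENECK))
        self.w_up = rng.normal(0.0, 0.1, (SOURCE, BOTTLENECK * HIDDEN))

    def generate(self, task, lang):
        """Return (down, up) adapter matrices for one task-language pair."""
        source = np.concatenate([self.task_emb[task], self.lang_emb[lang]])
        h = np.tanh(source @ self.w_src)
        down = (h @ self.w_down).reshape(HIDDEN, BOTTLENECK)
        up = (h @ self.w_up).reshape(BOTTLENECK, HIDDEN)
        return down, up

    def adapt(self, x, task, lang):
        """Apply the generated bottleneck adapter with a residual connection."""
        down, up = self.generate(task, lang)
        return x + np.maximum(x @ down, 0.0) @ up


hyper = HyperAdapter()
x = rng.normal(size=(5, HIDDEN))      # five token representations
y = hyper.adapt(x, "ner", "tr")       # any task can be paired with any language
```

Because the task and language embeddings are stored separately and only the hypernetwork is shared, weights can be generated for a task-language pair that never appeared together in training, which is the mechanism the abstract credits for zero-shot transfer to unseen combinations.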

URL

https://arxiv.org/abs/2205.12148

PDF

https://arxiv.org/pdf/2205.12148.pdf