Learning ASR pathways: A sparse multilingual ASR model

2022-09-13 05:14:08

Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, John H. L. Hansen, Ozlem Kalinli

arXiv_CL

arXiv_CL Speech_Recognition RNN Recognition Sparse Knowledge Pose Speech


Abstract

Neural network pruning can be effectively applied to compress automatic speech recognition (ASR) models. However, in multilingual ASR, language-agnostic pruning may lead to severe performance degradation on some languages, because a single language-agnostic pruning mask may not fit all languages and may discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), such that the parameters for each language are learned explicitly. Because the sub-networks overlap, the shared parameters can also enable knowledge transfer to lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on four languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models (-5.0% average WER) and a language-agnostically pruned model (-21.4% average WER), and provide better performance on low-resource languages compared to the monolingual sparse models.
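The idea of overlapping language-specific sub-networks can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not describe; it only shows how one binary mask per language can select a "pathway" from a shared parameter vector and how two masks can overlap. The language names, the importance scores, and the helper functions `magnitude_mask`, `apply_pathway`, and `overlap` are all hypothetical, and simple magnitude-based selection is swapped in as a stand-in for whatever criterion the paper uses.

```python
import random


def magnitude_mask(scores, sparsity):
    """Return a 0/1 mask that keeps the highest-scoring entries.

    `sparsity` is the fraction of entries to prune (assumed in [0, 1]).
    """
    n_prune = int(round(sparsity * len(scores)))
    n_keep = len(scores) - n_prune
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


def apply_pathway(shared, mask):
    """Activate one language's sub-network by zeroing masked-out weights."""
    return [w * m for w, m in zip(shared, mask)]


def overlap(mask_a, mask_b):
    """Count parameters that are active in both pathways."""
    return sum(a & b for a, b in zip(mask_a, mask_b))


rng = random.Random(0)

# Shared parameters of a toy "model" (hypothetical values).
shared = [rng.gauss(0.0, 1.0) for _ in range(10)]

# Hypothetical per-language importance scores: shared weights plus
# language-specific noise, so the resulting masks differ but can overlap.
scores = {
    lang: [w + rng.gauss(0.0, 0.5) for w in shared]
    for lang in ("lang_a", "lang_b")
}

# One mask per language at 50% sparsity.
masks = {lang: magnitude_mask(s, sparsity=0.5) for lang, s in scores.items()}

# The pathway for each language is the shared weights under that mask.
pathways = {lang: apply_pathway(shared, m) for lang, m in masks.items()}

# Parameters active in both pathways are the ones that joint training
# would update from both languages' data.
n_shared = overlap(masks["lang_a"], masks["lang_b"])
```

In this toy setup, a language-agnostic pruned model would correspond to using one mask for every language, while the pathway view keeps a separate mask per language over the same underlying weights.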

URL

https://arxiv.org/abs/2209.05735

PDF

https://arxiv.org/pdf/2209.05735.pdf