English2Gbe: A multilingual machine translation model for {Fon/Ewe}Gbe

2021-12-13 10:35:09

Gilles Hacheme

arXiv_CL NMT

Abstract

Language is an essential factor of emancipation. Unfortunately, most of the more than 2,000 African languages are low-resourced. The community has recently used machine translation to revive and strengthen several African languages. However, the trained models are often bilingual, resulting in a potentially exponential number of models to train and maintain to cover all possible translation directions. Additionally, bilingual models do not leverage the similarity between some of the languages. Consequently, multilingual neural machine translation (NMT) is gaining considerable interest, especially for low-resourced languages. Nevertheless, its adoption by the community is still limited. This paper introduces English2Gbe, a multilingual NMT model capable of translating from English to Ewe or Fon. Using BLEU, chrF, and TER scores computed with the SacreBLEU package (Post, 2018) for reproducibility, we show that English2Gbe outperforms bilingual models (English to Ewe and English to Fon) and gives state-of-the-art results on the JW300 benchmark for Fon established by Nekoto et al. (2020). We hope this work will contribute to the wide adoption of multilingual models within the community. Our code is available on GitHub.
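
The contrast the abstract draws between bilingual and multilingual setups can be illustrated with a small sketch. It counts the one-direction bilingual models needed to cover a set of languages, then shows one common way of letting a single model serve several targets: prepending a target-language token to the source sentence. The token format `<2fon>` and the function names `bilingual_directions` and `tag_source` are assumptions made for illustration; the abstract does not say how English2Gbe selects the target language.

```python
from itertools import permutations


def bilingual_directions(languages):
    """Return every ordered (source, target) pair.

    Each pair corresponds to one bilingual model in a setup where
    every translation direction gets its own model.
    """
    return list(permutations(languages, 2))


def tag_source(sentence, target_lang):
    """Prepend a target-language token to the source sentence.

    Hypothetical tagging scheme: the token tells a single multilingual
    model which language to produce. The "<2xx>" format is an assumption,
    not something taken from the paper.
    """
    return f"<2{target_lang}> {sentence}"


if __name__ == "__main__":
    languages = ["en", "ewe", "fon"]
    pairs = bilingual_directions(languages)
    print(f"{len(pairs)} bilingual models for {len(languages)} languages")

    # One multilingual model, two target languages, same English input.
    for target in ("ewe", "fon"):
        print(tag_source("Good morning", target))
```

With a tagging scheme like this, adding a new target language means adding tagged training pairs rather than training and maintaining another separate model.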

URL

https://arxiv.org/abs/2112.11482

PDF

https://arxiv.org/pdf/2112.11482.pdf