Abstract
Transfer learning with a unified Transformer framework (T5) that converts all language problems into a text-to-text format has recently been proposed as a simple, yet effective, transfer learning approach. Although a multilingual version of the T5 model (mT5) has been introduced, it is not clear how well it can fare on non-English tasks involving diverse data. To investigate this question, we apply mT5 on a language with a wide variety of dialects--Arabic. For evaluation, we use an existing benchmark for Arabic language understanding and introduce a new benchmark for Arabic language generation (ARGEN). We also pre-train three powerful Arabic-specific text-to-text Transformer based models and evaluate them on the two benchmarks. Our new models perform significantly better than mT5 and exceed MARBERT, the current state-of-the-art Arabic BERT-based model, on Arabic language understanding. The models also set new SOTA on the generation benchmark. Our new models and are publicly released at this https URL and ARLGE will be released through the same repository.
Abstract (translated)
URL
https://arxiv.org/abs/2109.12068