Abstract
Sign languages, often categorised as low-resource languages, face significant challenges in achieving accurate translation due to the scarcity of parallel annotated datasets. This paper introduces Select and Reorder (S&R), a novel approach that addresses data scarcity by breaking down the translation process into two distinct steps: Gloss Selection (GS) and Gloss Reordering (GR). Our method leverages large spoken language models and the substantial lexical overlap between source spoken languages and target sign languages to establish an initial alignment. Both steps use Non-AutoRegressive (NAR) decoding to reduce computation and speed up inference. Through this disentanglement of tasks, we achieve state-of-the-art BLEU and ROUGE scores on the Meine DGS Annotated (mDGS) dataset, demonstrating a substantial BLEU-1 improvement of 37.88% in Text to Gloss (T2G) translation. This innovative approach paves the way for more effective translation models for sign languages, even in resource-constrained settings.
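The two-step decomposition can be illustrated with a toy sketch. This is not the paper's implementation: the lexicon, the position scores, the example gloss order, and every function name below are hypothetical stand-ins for the learned Gloss Selection and Gloss Reordering models, chosen only to show how selection and ordering can be treated as separate decisions that do not depend on previously emitted outputs.

```python
# Toy illustration only. The lexicon, the position scores, and all names here
# are hypothetical and are not taken from the paper. The resulting gloss order
# is invented for the example and is not claimed to be valid DGS.

# Hypothetical word -> gloss lexicon; words without an entry are dropped.
LEXICON = {
    "tomorrow": "TOMORROW",
    "i": "I",
    "go": "GO",
    "school": "SCHOOL",
}

# Hypothetical per-gloss position scores, standing in for a learned
# reordering model.
POSITION_SCORE = {"TOMORROW": 0.0, "SCHOOL": 1.0, "I": 2.0, "GO": 3.0}


def select_glosses(words):
    """Step 1 (selection): map each word to a gloss independently, or drop it."""
    return [LEXICON[w.lower()] for w in words if w.lower() in LEXICON]


def reorder_glosses(glosses):
    """Step 2 (reordering): order all selected glosses in one pass by score."""
    return sorted(glosses, key=lambda g: POSITION_SCORE[g])


def translate(sentence):
    """Run selection, then reordering, on a whitespace-tokenised sentence."""
    return reorder_glosses(select_glosses(sentence.split()))


print(translate("I will go to school tomorrow"))
```

In this sketch each word is mapped without looking at earlier outputs, and the ordering is computed for all glosses at once, which is the sense in which the toy mirrors non-autoregressive decoding.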
URL
https://arxiv.org/abs/2404.11532