Fast transcription of speech in low-resource languages

2019-09-16 15:38:36

Mark Hasegawa-Johnson, Camille Goudeseune, Gina-Anne Levow

arXiv_CL Language_Model Speech

Abstract

We present software that, in only a few hours, transcribes forty hours of recorded speech in a surprise language, using only a few tens of megabytes of noisy text in that language and a zero-resource grapheme-to-phoneme (G2P) table. A pretrained acoustic model maps acoustic features to phonemes; a reversed G2P maps those phonemes to candidate graphemes; a language model then selects the most likely grapheme sequence, i.e., a transcription. This software has worked successfully with corpora in Arabic, Assamese, Kinyarwanda, Russian, Sinhalese, Swahili, Tagalog, and Tamil.
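
The last two stages of the pipeline described in the abstract (reversed G2P, then language-model selection) can be illustrated with a toy sketch. This is not the authors' implementation: the G2P table, the training text, the character-bigram language model, and the function names `invert_g2p`, `train_bigram_lm`, and `decode` are all hypothetical, chosen only to show the idea. The acoustic model is omitted, so the phoneme sequence is assumed to be given.

```python
import math
from collections import defaultdict

# Hypothetical toy G2P table (grapheme -> phoneme); a real table is far larger.
G2P = {"c": "k", "k": "k", "a": "a", "t": "t", "s": "s"}


def invert_g2p(g2p):
    """Reverse the table: each phoneme maps to every grapheme that can spell it."""
    p2g = defaultdict(list)
    for grapheme, phoneme in g2p.items():
        p2g[phoneme].append(grapheme)
    return dict(p2g)


def train_bigram_lm(text, alphabet, alpha=0.1):
    """Character-bigram model with add-alpha smoothing (an assumed stand-in LM)."""
    counts = defaultdict(lambda: defaultdict(float))
    for word in text.split():
        prev = "^"  # word-start symbol
        for ch in word:
            counts[prev][ch] += 1
            prev = ch

    def logprob(prev, ch):
        total = sum(counts[prev].values())
        return math.log((counts[prev][ch] + alpha) / (total + alpha * len(alphabet)))

    return logprob


def decode(phonemes, p2g, logprob):
    """Viterbi search for the grapheme string the language model scores highest."""
    # best[g] = (log-probability, text) of the best path ending in grapheme g
    best = {"^": (0.0, "")}
    for phoneme in phonemes:
        nxt = {}
        for g in p2g[phoneme]:
            nxt[g] = max(
                (score + logprob(prev, g), text + g)
                for prev, (score, text) in best.items()
            )
        best = nxt
    return max(best.values())[1]


p2g = invert_g2p(G2P)
lm = train_bigram_lm("cat cats cat sack", set(G2P))  # stand-in for the noisy text
result = decode(["k", "a", "t"], p2g, lm)
print(result)
```

Here the phoneme /k/ could be spelled "c" or "k"; the language model, having seen "cat" in its training text, prefers "cat" over "kat". A real system would apply the same principle to noisy phoneme hypotheses and a much larger text corpus.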

URL

https://arxiv.org/abs/1909.07285

PDF

https://arxiv.org/pdf/1909.07285.pdf