The CUHK-TUDELFT System for The SLT 2021 Children Speech Recognition Challenge

Abstract

This technical report describes our submission to Track 1 of the 2021 SLT Children Speech Recognition Challenge (CSRC). Our approach combines a joint CTC-attention end-to-end (E2E) speech recognition framework with transfer learning, data augmentation, and the development of several language models. We describe the data pre-processing procedures, the background, and the course of system development. We analyze the experimental results and compare the E2E system with a DNN-HMM hybrid system in detail. Our system achieved a character error rate (CER) of 20.1% on our designated test set and 23.6% on the official evaluation set, placing 10th overall.
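
The character error rate reported above is conventionally computed as the character-level edit distance between hypothesis and reference, divided by the number of reference characters. The following is a minimal sketch of that standard definition, not the challenge's official scoring script; the function names `edit_distance` and `cer` are illustrative, and any text normalization applied by the organizers is not modeled here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using a rolling DP row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


if __name__ == "__main__":
    # Toy example (made-up strings, not challenge data).
    print(cer(["abcd"], ["abxd"]))
```

Pooling edits over the whole corpus, as done here, weights long utterances more heavily than averaging per-utterance rates; which convention the challenge used is not stated in the abstract.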

URL

https://arxiv.org/abs/2011.06239

PDF

https://arxiv.org/pdf/2011.06239.pdf