An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

Abstract

Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner's speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems face at least three data-related challenges: limited annotated data, an uneven distribution of learner proficiency levels, and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore two novel modeling strategies, metric-based classification and loss reweighting, both of which leverage distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, improving CEFR prediction accuracy by more than 10%.
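The abstract names the two strategies without giving their formulation, so the following is only a minimal sketch of the general ideas, not the authors' method. It assumes inverse-frequency class weights for loss reweighting and a nearest-prototype rule with Euclidean distance for metric-based classification. All function names are hypothetical, and short toy vectors stand in for SSL-based embeddings.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed reweighting scheme: weight = N / (K * count), so rare levels weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {level: n / (k * count) for level, count in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true level); probs is a list of {level: probability} dicts."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


def class_prototypes(embeddings, labels):
    """Prototype of each level = mean of that level's embedding vectors."""
    counts = Counter(labels)
    sums = {}
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
    return {y: [value / counts[y] for value in acc] for y, acc in sums.items()}


def nearest_prototype(vec, prototypes):
    """Metric-based prediction: the level whose prototype is closest to vec."""
    return min(prototypes, key=lambda y: math.dist(vec, prototypes[y]))


# Toy, imbalanced data: one A2 speaker and three B1 speakers (illustrative only).
labels = ["A2", "B1", "B1", "B1"]
embeddings = [[0.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 2.0]]

weights = inverse_frequency_weights(labels)
prototypes = class_prototypes(embeddings, labels)
prediction = nearest_prototype([0.9, 1.8], prototypes)
```

In this sketch the under-represented A2 level receives a larger loss weight than B1, and a new embedding is assigned to whichever level's mean embedding lies nearest, which is one common way to cope with few labeled examples per level.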

URL

https://arxiv.org/abs/2404.07575

PDF

https://arxiv.org/pdf/2404.07575.pdf
