Analysis about Theoretical Foundations for Method to Enhancing ASR Performance using OCR Word Frequency Differences

Abstract

As interest in large language models (LLMs) grows, accuracy in automatic speech recognition (ASR) has become increasingly important. This is particularly true for lectures that contain specialized terminology, where traditional ASR models tend to have a low success rate. A method that uses word frequency differences to improve ASR performance on specialized terminology has been proposed. Through experiments and data analysis, we investigate whether this proposal effectively addresses the problem. Additionally, we introduce the power law as the theoretical foundation for the relative frequency.

URL

https://arxiv.org/abs/2405.02995

PDF

https://arxiv.org/pdf/2405.02995.pdf
