Abstract
Automatic speech recognition (ASR) plays a pivotal role in daily life, serving not only as an interface to machines but also as a communication aid for individuals with partial or profound hearing impairments. The process involves receiving the speech signal in analogue form and then applying signal processing algorithms that make it compatible with limited-capacity devices such as cochlear implants (CIs). Because these implants have a finite number of electrodes, the synthesized speech is often distorted. Despite researchers' efforts to improve the quality of received speech with state-of-the-art signal processing techniques, challenges persist, especially in scenarios involving multiple speech sources, environmental noise, and other circumstances. New artificial intelligence (AI) methods offer cutting-edge strategies for addressing the limitations and difficulties of the traditional signal processing techniques used in CIs. This review surveys advances in CI-based ASR and speech enhancement, along with related aspects. Its primary objective is to give a thorough overview of metrics and datasets, examine the capabilities of AI algorithms in this biomedical field, and summarize and comment on the best results obtained. It also discusses potential applications and suggests future directions for bridging existing research gaps in this domain.
URL
https://arxiv.org/abs/2403.15442