Abstract
Cognitive decline is a natural part of aging and often results in reduced cognitive abilities. In some cases, however, the decline is more pronounced, typically because of disorders such as Alzheimer's disease. Early detection of anomalous cognitive decline is crucial because it enables timely professional intervention. Although medical data can support this detection, collecting it often involves invasive procedures. An alternative is to use non-intrusive techniques such as speech or handwriting analysis, which need not interfere with daily activities. This survey reviews the most relevant methodologies that use deep learning to automate cognitive decline estimation, covering audio, text, and visual processing. We discuss the key features and advantages of each modality and methodology, including state-of-the-art approaches such as the Transformer architecture and foundation models. In addition, we present works that integrate different modalities into multimodal models. We also highlight the most significant datasets and the quantitative results from studies that use them. Several conclusions emerge from this review. In most cases, the textual modality achieves the best results and is the most relevant for detecting cognitive decline. Moreover, combining approaches from individual modalities into a multimodal model enhances performance in nearly all scenarios.
URL
https://arxiv.org/abs/2410.18972