Abstract
Artificial neural networks trained on large, expert-labelled datasets are considered state-of-the-art for a range of medical image recognition tasks. However, categorically labelled datasets are time-consuming to generate and constrain classification to a pre-defined, fixed set of classes. For neuroradiological applications in particular, this represents a barrier to clinical adoption. To address these challenges, we present a self-supervised text-vision framework that learns to detect clinically relevant abnormalities in brain MRI scans by directly leveraging the rich information contained in accompanying free-text neuroradiology reports. Our training approach consisted of two-steps. First, a dedicated neuroradiological language model - NeuroBERT - was trained to generate fixed-dimensional vector representations of neuroradiology reports (N = 50,523) via domain-specific self-supervised learning tasks. Next, convolutional neural networks (one per MRI sequence) learnt to map individual brain scans to their corresponding text vector representations by optimising a mean square error loss. Once trained, our text-vision framework can be used to detect abnormalities in unreported brain MRI examinations by scoring scans against suitable query sentences (e.g., 'there is an acute stroke', 'there is hydrocephalus' etc.), enabling a range of classification-based applications including automated triage. Potentially, our framework could also serve as a clinical decision support tool, not only by suggesting findings to radiologists and detecting errors in provisional reports, but also by retrieving and displaying examples of pathologies from historical examinations that could be relevant to the current case based on textual descriptors.
Abstract (translated)
通过在大型、专家标注的数据集上训练的人工神经网络被认为是各种医学图像识别任务的当前最先进的。然而,分类标注的数据集需要花费较长的时间来生成,并限制将分类限制为预定义、固定的类。特别是,在神经放射学应用中,这代表了临床采用的障碍。为了应对这些挑战,我们提出了一个自监督的文本视觉框架,通过直接利用伴随的免费文本神经放射学报告中的丰富信息来检测临床相关的异常脑MRI扫描。我们的训练方法包括两个步骤。首先,一个专门的语言模型——NeuroBERT 通过领域特定的自监督学习任务训练,生成固定维度的神经放射学报告的固定维向量表示(N = 50,523)。接下来,卷积神经网络(每个MRI序列一个)通过优化均方误差损失来学习将单个脑扫描映射到相应的文本向量表示。经过训练后,我们的文本视觉框架可用于通过评分扫描与适当的查询句子(例如,“有急性中风”,“有高血压”等)相匹配来检测未报告的脑MRI examination中的异常,实现各种分类基础应用(包括自动分类分诊)。可能的是,我们的框架还可以作为临床决策支持工具,不仅通过向放射科医生建议发现,还通过根据文本描述检索和显示历史检查中的疾病实例来发挥作用。
URL
https://arxiv.org/abs/2405.02782