Abstract
Objective: To develop a natural language processing system that performs both clinical concept extraction and relation extraction within a unified prompt-based machine reading comprehension (MRC) architecture and generalizes well to cross-institution applications.
Methods: We formulate both clinical concept extraction and relation extraction as MRC tasks in a unified prompt-based architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction on two benchmark datasets: one from the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and one from the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting, perform error analyses, and examine how different prompting strategies affect the performance of the MRC models.
Results and Conclusion: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the two benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the two datasets by 1%-3% and 0.7%-1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%-2.4% and 10%-11%, respectively. In the cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% on the two datasets, respectively. The proposed method is better at handling nested/overlapped concepts and at extracting relations, and it has good portability for cross-institution applications.
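To make the MRC formulation concrete, the following is a minimal sketch of how a clinical note might be turned into prompt-based (query, context, answer spans) examples, and of how strict and lenient span matching could be scored. The abstract does not give the actual prompts, data structures, or scoring code, so everything named here (`CONCEPT_PROMPTS`, `MRCExample`, `build_concept_examples`, `build_relation_example`, `span_f1`) is hypothetical. The sketch also assumes that "strict" means exact boundary match and "lenient" means any character overlap.

```python
from dataclasses import dataclass

# Hypothetical prompt templates, one per concept type (not taken from the paper).
CONCEPT_PROMPTS = {
    "Drug": "Find all medications mentioned in the text.",
    "ADE": "Find all adverse drug events mentioned in the text.",
}


@dataclass(frozen=True)
class MRCExample:
    query: str       # the prompt
    context: str     # the clinical text
    answers: tuple   # (start, end) character offsets, end exclusive


def build_concept_examples(text, gold):
    """Build one MRC example per concept type.

    `gold` maps a concept type to a list of (start, end) spans. Because each
    type gets its own query, nested or overlapping spans of different types
    do not compete for the same token label.
    """
    return [
        MRCExample(prompt, text, tuple(sorted(gold.get(ctype, []))))
        for ctype, prompt in CONCEPT_PROMPTS.items()
    ]


def build_relation_example(text, drug_span, ade_spans):
    """Build a relation query conditioned on an already extracted drug."""
    drug = text[drug_span[0]:drug_span[1]]
    query = f"Find all adverse drug events caused by {drug}."  # assumed wording
    return MRCExample(query, text, tuple(sorted(ade_spans)))


def span_f1(pred, gold, lenient=False):
    """F1 over spans: exact match if strict, any overlap if lenient."""
    def match(p, g):
        if lenient:
            return p[0] < g[1] and g[0] < p[1]
        return p == g

    hit_pred = sum(any(match(p, g) for g in gold) for p in pred)
    hit_gold = sum(any(match(p, g) for p in pred) for g in gold)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Usage on a made-up sentence (not from either n2c2 dataset).
note = "Patient developed a rash after starting amoxicillin."
ade = (note.index("rash"), note.index("rash") + len("rash"))
drug = (note.index("amoxicillin"), note.index("amoxicillin") + len("amoxicillin"))
for example in build_concept_examples(note, {"Drug": [drug], "ADE": [ade]}):
    print(example.query, example.answers)
print(build_relation_example(note, drug, [ade]).query)
```

In this framing, relation extraction reuses the same span-extraction machinery as concept extraction: only the query changes, which is one plausible reading of what "unified" means in the abstract.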
URL
https://arxiv.org/abs/2303.08262