Efficient Joint Learning for Clinical Named Entity Recognition and Relation Extraction Using Fourier Networks: A Use Case in Adverse Drug Events

Abstract

Current approaches for clinical information extraction are inefficient in terms of computational cost and memory consumption, which hinders their application to large-scale electronic health records (EHRs). We propose an efficient end-to-end model, the Joint-NER-RE-Fourier (JNRF), to jointly learn the tasks of named entity recognition and relation extraction for documents of variable length. The architecture uses positional encoding and a batch size of one to process variable-length documents, and a weight-shared Fourier network layer for low-complexity token mixing. Finally, we reach the theoretical lower bound on computational complexity for relation extraction using a selective pooling strategy and distance-aware attention weights with trainable polynomial distance functions. We evaluated the JNRF architecture on the 2018 N2C2 ADE benchmark to jointly extract medication-related entities and relations in variable-length EHR summaries. JNRF outperforms rolling window BERT with selective pooling by 0.42%, while being twice as fast to train. Compared to state-of-the-art BiLSTM-CRF architectures on the N2C2 ADE benchmark, results show that the proposed approach trains 22 times faster and reduces GPU memory consumption by a factor of 1.75, with a reasonable performance tradeoff of 90%, without the use of external tools, hand-crafted rules or post-processing. Given the significant carbon footprint of deep learning models and the current energy crises, these methods could support efficient and cleaner information extraction in EHRs and other types of large-scale document databases.
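The token-mixing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the Fourier layer follows the common FNet-style recipe (keep the real part of a 2D discrete Fourier transform over the token and feature axes), and it reads "weight-shared" as reusing one set of feed-forward weights at every depth. The function names, the `params` dictionary, and the layer sizes are all hypothetical.

```python
import numpy as np


def layer_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each token vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Parameter-free token mixing: real part of the 2D DFT of a (seq_len, hidden) array."""
    return np.fft.fft2(x).real


def mixing_block(x: np.ndarray, params: dict) -> np.ndarray:
    """Fourier mixing, then a position-wise feed-forward layer, each with a residual."""
    h = layer_norm(x + fourier_mix(x))
    ff = np.maximum(h @ params["w1"] + params["b1"], 0.0) @ params["w2"] + params["b2"]
    return layer_norm(h + ff)


def encode(doc: np.ndarray, params: dict, n_layers: int = 2) -> np.ndarray:
    """Encode one document (batch size of one) by reusing the same block weights."""
    for _ in range(n_layers):
        doc = mixing_block(doc, params)  # assumed form of weight sharing
    return doc


rng = np.random.default_rng(0)
hidden, inner = 8, 16  # hypothetical sizes, chosen only for the demo
params = {
    "w1": rng.normal(scale=0.1, size=(hidden, inner)),
    "b1": np.zeros(inner),
    "w2": rng.normal(scale=0.1, size=(inner, hidden)),
    "b2": np.zeros(hidden),
}

# Documents of different lengths are processed one at a time, so no padding is needed.
for seq_len in (5, 12):
    doc = rng.normal(size=(seq_len, hidden))  # stand-in for embeddings plus positional encoding
    print(encode(doc, params).shape)
```

Each output keeps the shape of its input document. Because the mixing step has no learned parameters and the FFT runs in O(n log n) in the sequence length, this kind of layer avoids the quadratic cost of self-attention. That is the efficiency argument the abstract makes, although the sketch says nothing about the reported speed or memory figures.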

URL

https://arxiv.org/abs/2302.04185

PDF

https://arxiv.org/pdf/2302.04185.pdf