Abstract
Previous work on clinical relation extraction from free-text sentences leveraged semantic type information from clinical knowledge bases as part of entity representations. In this paper, we exploit additional evidence by also using domain-specific semantic type dependencies: we encode the relation between a span of tokens matching a Unified Medical Language System (UMLS) concept and the other tokens in the sentence. We implement our method and compare it against different named entity recognition (NER) architectures (BiLSTM-CRF and BiLSTM-GCN-CRF) using different pre-trained clinical embeddings (BERT, BioBERT, and UMLSBert). Our experimental results on clinical datasets show that, in some cases, NER effectiveness can be significantly improved by using domain-specific semantic type dependencies. Our work is also the first study to generate a matrix encoding that uses more than three dependencies in one pass for the NER task.
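The abstract does not spell out the encoding itself, so the following is only a minimal sketch of what a token-by-token matrix of semantic type dependencies might look like. Everything in it is an assumption made for illustration: the function name `build_dependency_matrix`, the span format `(start, end, semantic_type)`, the toy relation table, and the integer ids are hypothetical and are not taken from the paper or from the UMLS release files.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lookup: (source semantic type, target semantic type) -> relation name.
# Illustrative only; a real system would derive this from a clinical knowledge base.
TYPE_RELATIONS: Dict[Tuple[str, str], str] = {
    ("Pharmacologic Substance", "Sign or Symptom"): "treats",
}

# Hypothetical integer ids for the matrix cells; 0 means "no dependency".
RELATION_IDS: Dict[str, int] = {"treats": 1}

Span = Tuple[int, int, str]  # (start token index, end token index exclusive, semantic type)


def build_dependency_matrix(n_tokens: int, spans: List[Span]) -> List[List[int]]:
    """Return an n_tokens x n_tokens matrix of relation ids between concept spans."""
    matrix = [[0] * n_tokens for _ in range(n_tokens)]
    for src_start, src_end, src_type in spans:
        for tgt_start, tgt_end, tgt_type in spans:
            relation = TYPE_RELATIONS.get((src_type, tgt_type))
            if relation is None:
                continue  # no known dependency between these two semantic types
            relation_id = RELATION_IDS[relation]
            # Mark every (source token, target token) pair covered by the two spans.
            for i in range(src_start, src_end):
                for j in range(tgt_start, tgt_end):
                    matrix[i][j] = relation_id
    return matrix


tokens = ["Aspirin", "relieves", "chest", "pain"]
spans = [(0, 1, "Pharmacologic Substance"), (2, 4, "Sign or Symptom")]
matrix = build_dependency_matrix(len(tokens), spans)
for token, row in zip(tokens, matrix):
    print(f"{token:10s} {row}")
```

In this toy sentence, only the row for "Aspirin" has non-zero cells, in the columns for "chest" and "pain". Because each cell stores a relation id, a single matrix of this kind can hold several dependency types at once, which is one plausible reading of using multiple dependencies in one pass; how the paper feeds such a matrix into the BiLSTM-CRF or BiLSTM-GCN-CRF models is not stated in the abstract.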
URL
https://arxiv.org/abs/2503.05373