Large language models (LLMs) encode parametric knowledge about world facts and have shown remarkable performance in knowledge-driven NLP tasks. However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks). In this paper, we seek to assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention. We demonstrate that LLMs' faithfulness can be significantly improved using carefully designed prompting strategies. In particular, we identify opinion-based prompts and counterfactual demonstrations as the most effective methods. Opinion-based prompts reframe the context as a narrator's statement and inquire about the narrator's opinions, while counterfactual demonstrations use instances containing false facts to improve faithfulness in knowledge conflict situations. Neither technique requires additional training. We conduct experiments on three datasets of two standard NLP tasks, machine reading comprehension and relation extraction, and the results demonstrate significant improvement in faithfulness to contexts.
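To make the two strategies concrete, here is a minimal sketch of how such prompts could be assembled. The template wording is illustrative rather than the paper's exact phrasing, and the demonstration fact is invented for the example.

```python
# Sketch of opinion-based prompting plus a counterfactual demonstration.
# Template wording is illustrative; feed the result to any chat LLM.

def opinion_based_prompt(context: str, question: str) -> str:
    """Reframe the context as a narrator's statement and ask for the
    narrator's opinion, steering the model toward the given context
    rather than its parametric knowledge."""
    return (
        f'Bob said, "{context}"\n'
        f"Q: {question.rstrip('?')} in Bob's opinion?\n"
        "A:"
    )

def with_counterfactual_demonstration(prompt: str) -> str:
    """Prepend a demonstration whose context states a false fact; its
    answer follows the context, modeling faithful behavior."""
    demo = (
        'Bob said, "The capital of France is Berlin."\n'
        "Q: What is the capital of France in Bob's opinion?\n"
        "A: Berlin\n\n"
    )
    return demo + prompt
```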
https://arxiv.org/abs/2303.11315
Joint entity and relation extraction (JERE) is one of the most important tasks in information extraction. However, most existing works focus on sentence-level, coarse-grained JERE, which limits their use in real-world scenarios. In this paper, we construct a large-scale document-level fine-grained JERE dataset, DocRED-FE, which augments DocRED with fine-grained entity types. Specifically, we redesign a hierarchical entity type schema including 11 coarse-grained types and 119 fine-grained types, and then re-annotate DocRED manually according to this schema. Through comprehensive experiments we find that: (1) DocRED-FE is challenging to existing JERE models; (2) our fine-grained entity types promote relation classification. We make DocRED-FE, along with instructions and the code for our baselines, publicly available at this https URL.
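For intuition, a two-level schema of the kind described can be represented as a simple coarse-to-fine mapping; the type names below are illustrative stand-ins, not the dataset's actual 11/119 types.

```python
# Illustrative slice of a hierarchical entity type schema; the real
# coarse/fine types ship with the DocRED-FE release.
SCHEMA: dict[str, list[str]] = {
    "Person": ["Politician", "Athlete", "Writer"],
    "Organization": ["Company", "Government Agency"],
    "Location": ["Country", "City"],
}

def coarse_type(fine: str) -> str:
    """Map a fine-grained type back to its coarse-grained parent."""
    return next(c for c, fines in SCHEMA.items() if fine in fines)
```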
https://arxiv.org/abs/2303.11141
Given comparative text, comparative relation extraction aims to extract the two targets under comparison (e.g., two cameras) and the aspect they are compared on (e.g., image quality). The extracted comparative relations form the basis of further opinion analysis. Existing solutions formulate this task as a sequence labeling task to extract targets and aspects. However, they cannot directly extract comparative relation(s) from text. In this paper, we show that comparative relations can be directly extracted with high accuracy by a generative model. Based on GPT-2, we propose a Generation-based Comparative Relation Extractor (GCRE-GPT). Experiment results show that GCRE-GPT achieves state-of-the-art accuracy on two datasets.
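A generation-based extractor of this kind needs a linearization of the comparative relation that the model can emit as plain text and the pipeline can parse back. The tag format below is an assumed example, not GCRE-GPT's actual output scheme.

```python
import re

# Assumed linearization: relation triples become tagged text that the
# generator can produce and a regex can invert.

def linearize(target_1: str, target_2: str, aspect: str) -> str:
    return f"<t1> {target_1} <t2> {target_2} <asp> {aspect}"

def parse(generated: str) -> tuple[str, str, str] | None:
    m = re.match(r"<t1> (.*?) <t2> (.*?) <asp> (.*)", generated.strip())
    return m.groups() if m else None

# A training pair would look like:
#   input : "Camera A takes sharper photos than camera B."
#   output: "<t1> camera A <t2> camera B <asp> photos"
```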
https://arxiv.org/abs/2303.08601
Objective: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications. Methods: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using two benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models. Results and Conclusion: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the two benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the two datasets by 1%~3% and 0.7%~1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%~2.4% and 10%~11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% for the two datasets, respectively. The proposed method is better at handling nested/overlapped concepts and extracting relations, and has good portability for cross-institution applications.
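The unifying idea is that both tasks reduce to asking questions against the note and extracting answer spans. A sketch with assumed question templates (the paper's actual prompts may differ):

```python
# Both concept and relation extraction become (question, passage) pairs
# for one span-extraction QA model; the templates below are assumptions.

CONCEPT_QUESTIONS = {
    "Drug": "What medications are mentioned in the text?",
    "ADE": "What adverse drug events are mentioned in the text?",
}

def relation_question(head_span: str, relation_type: str) -> str:
    templates = {
        "Drug-ADE": f"What adverse events are caused by {head_span}?",
        "Drug-Dosage": f"What is the dosage of {head_span}?",
    }
    return templates[relation_type]

# qa_model(question=CONCEPT_QUESTIONS["Drug"], context=note) yields spans;
# a second MRC pass with relation_question(...) links them into relations.
```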
https://arxiv.org/abs/2303.08262
Recently, many studies incorporate external knowledge into character-level feature-based models to improve the performance of Chinese relation extraction. However, these methods tend to ignore the internal information of the Chinese character and cannot filter out the noisy information of external knowledge. To address these issues, we propose a mixture-of-view-experts framework (MoVE) to dynamically learn multi-view features for Chinese relation extraction. With both the internal and external knowledge of Chinese characters, our framework can better capture the semantic information of Chinese characters. To demonstrate the effectiveness of the proposed framework, we conduct extensive experiments on three real-world datasets in distinct domains. Experimental results show the consistent and significant superiority and robustness of our proposed framework. Our code and dataset will be released at: this https URL
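The "mixture of view experts" idea can be pictured as a learned gate that softly weights per-view character features (e.g., internal glyph/pinyin views and external lexicon views) before fusion. A minimal PyTorch sketch under those assumptions:

```python
import torch

class ViewGating(torch.nn.Module):
    """Softly fuse per-view features with a learned gate (illustrative)."""

    def __init__(self, n_views: int, dim: int):
        super().__init__()
        self.gate = torch.nn.Linear(n_views * dim, n_views)

    def forward(self, views: list[torch.Tensor]) -> torch.Tensor:
        stacked = torch.stack(views, dim=1)            # (batch, n_views, dim)
        weights = torch.softmax(self.gate(stacked.flatten(1)), dim=-1)
        # Low gate weights can suppress noisy views, e.g. irrelevant
        # external knowledge for a given character.
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)
```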
https://arxiv.org/abs/2303.05082
Recent work has utilised knowledge-aware approaches to natural language understanding, question answering, recommendation systems, and other tasks. These approaches rely on well-constructed and large-scale knowledge graphs that can be useful for many downstream applications and empower knowledge-aware models with commonsense reasoning. Such knowledge graphs are constructed through knowledge acquisition tasks such as relation extraction and knowledge graph completion. This work seeks to utilise and build on the growing body of work that uses findings from the field of natural language processing (NLP) to extract knowledge from text and build knowledge graphs. The focus of this research project is on how we can use transformer-based approaches to extract and contextualise event information, matching it to existing ontologies, to build comprehensive graph-based event representations. Specifically, sub-event extraction is used as a way of creating sub-event-aware event representations. These event representations are then further enriched through fine-grained location extraction and contextualised through the alignment of historically relevant quotes.
https://arxiv.org/abs/2303.04794
Recent advancements in large language models (LLMs) have led to the development of highly potent models like OpenAI's ChatGPT. These models have exhibited exceptional performance in a variety of tasks, such as question answering, essay composition, and code generation. However, their effectiveness in the healthcare sector remains uncertain. In this study, we seek to investigate the potential of ChatGPT to aid in clinical text mining by examining its ability to extract structured information from unstructured healthcare texts, with a focus on biological named entity recognition and relation extraction. However, our preliminary results indicate that employing ChatGPT directly for these tasks resulted in poor performance and raised privacy concerns associated with uploading patients' information to the ChatGPT API. To overcome these limitations, we propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data with labels utilizing ChatGPT and fine-tuning a local model for the downstream task. Our method has resulted in significant improvements in the performance of downstream tasks, improving the F1-score from 23.37% to 63.99% for the named entity recognition task and from 75.86% to 83.59% for the relation extraction task. Furthermore, generating data using ChatGPT can significantly reduce the time and effort required for data collection and labeling, as well as mitigate data privacy concerns. In summary, the proposed framework presents a promising solution to enhance the applicability of LLMs to clinical text mining.
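The training paradigm is straightforward to sketch: prompt the LLM for labeled synthetic examples, parse them, and fine-tune a local model. `call_chatgpt` below is a placeholder for whichever chat-completion client you use, and the prompt and tagging scheme are assumptions, not the paper's exact recipe.

```python
import json

def call_chatgpt(prompt: str) -> str:
    """Placeholder: swap in your chat-completion client of choice."""
    raise NotImplementedError

PROMPT = (
    "Write 5 synthetic (not real-patient) clinical sentences. Wrap every "
    "disease mention in [DIS]...[/DIS] tags. Return a JSON list of strings."
)

def synthesize(n_batches: int) -> list[str]:
    examples: list[str] = []
    for _ in range(n_batches):
        examples.extend(json.loads(call_chatgpt(PROMPT)))
    return examples  # convert tags to BIO labels, then fine-tune locally
```

Because only synthetic text leaves the local environment, no patient data is ever sent to the API.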
https://arxiv.org/abs/2303.04360
Relation extraction (RE) has recently moved from the sentence-level to document-level, which requires aggregating document information and using entities and mentions for reasoning. Existing works put entity nodes and mention nodes with similar representations in a document-level graph, whose complex edges may incur redundant information. Furthermore, existing studies only focus on entity-level reasoning paths without considering global interactions among cross-sentence entities. To these ends, we propose a novel document-level RE model with a GRaph information Aggregation and Cross-sentence Reasoning network (GRACR). Specifically, a simplified document-level graph is constructed to model the semantic information of all mentions and sentences in a document, and an entity-level graph is designed to explore relations of long-distance cross-sentence entity pairs. Experimental results show that GRACR achieves excellent performance on two public datasets of document-level RE. It is especially effective in extracting potential relations of cross-sentence entity pairs. Our code is available at this https URL.
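A simplified document-level graph of the kind described can be built from just mention and sentence nodes with a few edge types. The schema below (sentence adjacency, mention-in-sentence, co-reference) is an illustrative guess at the construction, not GRACR's exact design.

```python
from collections import defaultdict

def build_doc_graph(sentences, mentions):
    """sentences: list[str]; mentions: list of (entity_id, sent_idx)."""
    edges = defaultdict(set)
    for i in range(len(sentences) - 1):                 # sentence adjacency
        edges[("sent", i)].add(("sent", i + 1))
    for ent, s in mentions:                             # mention-in-sentence
        edges[("mention", ent, s)].add(("sent", s))
    for e1, s1 in mentions:                             # co-reference links
        for e2, s2 in mentions:
            if e1 == e2 and s1 != s2:
                edges[("mention", e1, s1)].add(("mention", e2, s2))
    return edges  # feed to a GNN; pool mentions per entity for entity graph
```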
https://arxiv.org/abs/2303.03912
NLP Workbench is a web-based platform for text mining that allows non-expert users to obtain semantic understanding of large-scale corpora using state-of-the-art text mining models. The platform is built upon the latest pre-trained models and open-source systems from academia that provide semantic analysis functionalities, including but not limited to entity linking, sentiment analysis, semantic parsing, and relation extraction. Its extensible design enables researchers and developers to smoothly replace an existing model or integrate a new one. To improve efficiency, we employ a microservice architecture that facilitates allocation of acceleration hardware and parallelization of computation. This paper presents the architecture of NLP Workbench and discusses the challenges we faced in designing it. We also discuss diverse use cases of NLP Workbench and the benefits of using it over other approaches. The platform is under active development, with its source code released under the MIT license. A website and a short video demonstrating our platform are also available.
https://arxiv.org/abs/2303.01410
Relation tuple extraction from text is an important task for building knowledge bases. Recently, joint entity and relation extraction models have achieved very high F1 scores in this task. However, the experimental settings used by these models are restrictive and the datasets used in the experiments are not realistic. They do not include sentences with zero tuples (zero-cardinality). In this paper, we evaluate the state-of-the-art joint entity and relation extraction models in a more realistic setting. We include sentences that do not contain any tuples in our experiments. Our experiments show that there is a significant drop ($\sim 10-15\%$ on one dataset and $\sim 6-14\%$ on another) in their F1 scores in this setting. We also propose a two-step modeling approach using a simple BERT-based classifier that leads to improvement in the overall performance of these models in this realistic experimental setup.
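The two-step modeling reduces to a gate: a binary sentence classifier (e.g., a fine-tuned BERT model) decides whether a sentence carries any tuple, and only positive sentences reach the joint extractor. A sketch with both models passed in as callables:

```python
def extract_with_gate(sentences, has_tuple, joint_extract):
    """has_tuple: sentence -> bool (the BERT-based gate);
    joint_extract: sentence -> list of (head, relation, tail) tuples."""
    results = []
    for sent in sentences:
        # Zero-cardinality sentences are filtered up front, so the joint
        # model is never forced to emit tuples for them.
        results.append(joint_extract(sent) if has_tuple(sent) else [])
    return results
```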
https://arxiv.org/abs/2302.09887
Document-level relation extraction (DocRE) is the task of identifying all relations between each entity pair in a document. Evidence, defined as sentences containing clues for the relationship between an entity pair, has been shown to help DocRE systems focus on relevant texts, thus improving relation extraction. However, evidence retrieval (ER) in DocRE faces two major issues: high memory consumption and limited availability of annotations. This work aims at addressing these issues to improve the usage of ER in DocRE. First, we propose DREEAM, a memory-efficient approach that adopts evidence information as the supervisory signal, thereby guiding the attention modules of the DocRE system to assign high weights to evidence. Second, we propose a self-training strategy for DREEAM to learn ER from automatically-generated evidence on massive data without evidence annotations. Experimental results reveal that our approach exhibits state-of-the-art performance on the DocRED benchmark for both DocRE and ER. To the best of our knowledge, DREEAM is the first approach to employ ER self-training.
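The core trick, using evidence as a supervisory signal for attention, can be written as a divergence between the model's sentence-level attention and the normalized evidence annotation. A minimal PyTorch sketch, with shapes as assumptions:

```python
import torch
import torch.nn.functional as F

def evidence_guidance_loss(attn: torch.Tensor, evidence: torch.Tensor) -> torch.Tensor:
    """attn: (batch, n_sents) attention over sentences for an entity pair,
    already normalized; evidence: (batch, n_sents) 0/1 evidence labels."""
    target = evidence / evidence.sum(dim=-1, keepdim=True).clamp(min=1)
    return F.kl_div(attn.clamp_min(1e-12).log(), target, reduction="batchmean")

# In the self-training stage, `evidence` would be replaced by a teacher
# model's attention over unannotated data instead of human labels.
```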
https://arxiv.org/abs/2302.08675
Existing models to extract temporal relations between events lack a principled method to incorporate external knowledge. In this study, we introduce Bayesian-Trans, a Bayesian learning-based method that models the temporal relation representations as latent variables and infers their values via Bayesian inference and translational functions. Compared to conventional neural approaches, instead of performing point estimation to find the best set of parameters, the proposed model infers the parameters' posterior distribution directly, enhancing the model's capability to encode and express uncertainty about the predictions. Experimental results on three widely used datasets show that Bayesian-Trans outperforms existing approaches for event temporal relation extraction. We additionally present detailed analyses on uncertainty quantification, comparison of priors, and ablation studies, illustrating the benefits of the proposed approach.
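The gist can be sketched as a relation embedding with a learned Gaussian posterior, sampled via the reparameterization trick and scored with a translational function (a TransE-style score here; the paper explores several translational variants, and the KL regularizer is omitted for brevity):

```python
import torch

class BayesianTransScore(torch.nn.Module):
    """Sample the relation representation, then score head + r ~= tail."""

    def __init__(self, n_relations: int, dim: int = 128):
        super().__init__()
        self.mu = torch.nn.Embedding(n_relations, dim)
        self.log_sigma = torch.nn.Embedding(n_relations, dim)

    def forward(self, head, tail, rel):
        mu, sigma = self.mu(rel), self.log_sigma(rel).exp()
        r = mu + sigma * torch.randn_like(sigma)   # reparameterized sample
        return -(head + r - tail).norm(dim=-1)     # higher = more plausible
```

Repeated forward passes with different samples give a spread of scores, which is where the uncertainty estimates come from.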
https://arxiv.org/abs/2302.04985
Specialised pre-trained language models are becoming more common in NLP since they can potentially outperform models trained on generic texts. BioBERT and BioClinicalBERT are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like Knowledge Distillation (KD), it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on the development of compact language models for processing clinical texts (i.e., progress notes, discharge summaries, etc.). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from 15 million to 65 million. These models performed comparably to larger models such as BioBERT and BioClinicalBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including Natural Language Inference, Relation Extraction, Named Entity Recognition, and Sequence Classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Huggingface profile at this https URL and Github page at this https URL, respectively, promoting reproducibility of our results.
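A standard distillation objective of the kind such compact models are trained with combines the teacher's softened logits with the usual hard-label loss (the paper's full recipe also involves continual learning):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style KD: KL to the teacher's tempered distribution plus
    cross-entropy on gold labels; T and alpha are typical defaults."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients stay comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```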
https://arxiv.org/abs/2302.04725
Current approaches for clinical information extraction are inefficient in terms of computational costs and memory consumption, hindering their application to process large-scale electronic health records (EHRs). We propose an efficient end-to-end model, the Joint-NER-RE-Fourier (JNRF), to jointly learn the tasks of named entity recognition and relation extraction for documents of variable length. The architecture uses positional encoding and unitary batch sizes to process variable length documents and uses a weight-shared Fourier network layer for low-complexity token mixing. Finally, we reach the theoretical computational-complexity lower bound for relation extraction using a selective pooling strategy and distance-aware attention weights with trainable polynomial distance functions. We evaluated the JNRF architecture using the 2018 N2C2 ADE benchmark to jointly extract medication-related entities and relations in variable-length EHR summaries. JNRF outperforms rolling window BERT with selective pooling by 0.42%, while being twice as fast to train. Compared to state-of-the-art BiLSTM-CRF architectures on the N2C2 ADE benchmark, results show that the proposed approach trains 22 times faster and reduces GPU memory consumption by a factor of 1.75, with a reasonable performance tradeoff (retaining 90% of the performance), without the use of external tools, hand-crafted rules, or post-processing. Given the significant carbon footprint of deep learning models and the current energy crises, these methods could support efficient and cleaner information extraction in EHRs and other types of large-scale document databases.
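The low-complexity token mixing suggests an FNet-style Fourier layer: replace quadratic self-attention with a parameter-free 2-D FFT over the sequence and hidden dimensions. A sketch under that assumption (JNRF's exact layer may differ):

```python
import torch

class FourierMixing(torch.nn.Module):
    """Parameter-free token mixing via FFT (FNet-style), O(n log n)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden); keep the real part of the 2-D FFT.
        return torch.fft.fft2(x, dim=(-2, -1)).real
```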
https://arxiv.org/abs/2302.04185
The main purpose of relation extraction is to extract the semantic relationships between tagged pairs of entities in a sentence, which plays an important role in the semantic understanding of sentences and the construction of knowledge graphs. In this paper, we propose the hypothesis that the key semantic information within a sentence plays a key role in entity relationship extraction. Based on this hypothesis, we split the sentence into three segments according to the positions of the entities, and find the fine-grained semantic features inside the sentence through an intra-sentence attention mechanism to reduce the interference of irrelevant noise information. The proposed relation extraction model can make full use of the available positive semantic information. The experimental results show that the proposed relation extraction model improves the precision-recall curves and P@N values compared with existing methods, which proves the effectiveness of this model.
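The segment split itself is mechanical: the entity positions cut the sentence into the span before the first entity, the span between the two entities (which usually carries the relational phrase), and the span after the second. A sketch:

```python
def split_by_entities(tokens, span_a, span_b):
    """span_a/span_b: (start, end) token indices of the two entities."""
    (s1, e1), (s2, e2) = sorted([span_a, span_b])
    left = tokens[:s1]
    middle = tokens[e1:s2]   # typically holds the relation-bearing words
    right = tokens[e2:]
    # Each segment then gets its own intra-sentence attention weights,
    # damping noise from tokens irrelevant to the entity pair.
    return left, middle, right
```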
https://arxiv.org/abs/2302.02078
Pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT) have achieved state-of-the-art performance in natural language processing (NLP) tasks. Recently, BERT has been adapted to the biomedical domain. Despite their effectiveness, these models have hundreds of millions of parameters and are computationally expensive when applied to large-scale NLP applications. We hypothesized that the number of parameters of the original BERT can be dramatically reduced with minor impact on performance. In this study, we present Bioformer, a compact BERT model for biomedical text mining. We pretrained two Bioformer models (named Bioformer8L and Bioformer16L) which reduced the model size by 60% compared to BERTBase. Bioformer uses a biomedical vocabulary and was pre-trained from scratch on PubMed abstracts and PubMed Central full-text articles. We thoroughly evaluated the performance of Bioformer as well as existing biomedical BERT models including BioBERT and PubMedBERT on 15 benchmark datasets of four different biomedical NLP tasks: named entity recognition, relation extraction, question answering and document classification. The results show that with 60% fewer parameters, Bioformer16L is only 0.1% less accurate than PubMedBERT while Bioformer8L is 0.9% less accurate than PubMedBERT. Both Bioformer16L and Bioformer8L outperformed BioBERTBase-v1.1. In addition, Bioformer16L and Bioformer8L are 2-3 times as fast as PubMedBERT/BioBERTBase-v1.1. Bioformer has been successfully deployed to PubTator Central providing gene annotations for over 35 million PubMed abstracts and 5 million PubMed Central full-text articles. We make Bioformer publicly available via this https URL, including pre-trained models, datasets, and instructions for downstream use.
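If the checkpoints are published on the Hugging Face hub, using them should be a drop-in swap for any BERT pipeline; the model id below is a guess at the published name, so verify it against the release URL.

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "bioformers/bioformer-8L"  # assumed hub id; check the release page

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)  # compact BERT-style encoder
```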
https://arxiv.org/abs/2302.01588
Automatic causal graph construction is of high importance in medical research. Causal graphs have many applications, such as clinical trial criteria design, where identification of confounding variables is a crucial step. The quality bar for clinical applications is high, and the lack of public corpora is a barrier for such studies. Large language models (LLMs) have demonstrated impressive capabilities in natural language processing and understanding, so applying such models in clinical settings is an attractive direction, especially in clinical applications with complex relations between entities, such as diseases, symptoms, and treatments. While relation extraction has already been studied using LLMs, here we present an end-to-end machine learning solution for causal relationship analysis among the aforementioned entities using EMR notes. Additionally, in comparison to other studies, we present an extensive evaluation of the method.
https://arxiv.org/abs/2301.12473
Objective: The 2022 n2c2 NLP Challenge posed the identification of social determinants of health (SDOH) in clinical narratives. We present three systems that we developed for the Challenge and discuss the distinctive task formulation used in each of the three systems. Materials and Methods: The first system identifies target pieces of information independently using machine learning classifiers. The second system uses a large language model (LLM) to extract complete structured outputs per document. The third system extracts candidate phrases using machine learning and identifies target relations with hand-crafted rules. Results: The three systems achieved F1 scores of 0.884, 0.831, and 0.663 in Subtask A of the Challenge, ranking third, seventh, and eighth among the 15 participating teams. The review of the extraction results from our systems reveals characteristics of each approach and those of the SDOH extraction task. Discussion: The phrases and relations annotated in the task are unique and diverse, not conforming to the conventional event extraction task, and are difficult to model with limited training data. The system that extracts information independently, ignoring the annotated relations, achieves the highest F1 score. Meanwhile, the LLM with its versatile capability achieves a high F1 score while respecting the annotated relations. The rule-based system tackling relation extraction obtains a lower F1 score, while being the most explainable approach. Conclusion: The F1 scores of the three systems vary in this challenge setting, but each approach has advantages and disadvantages in a practical application. The selection of the approach depends not only on the F1 score but also on the requirements of the application.
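As an illustration of the third system's style, a hand-crafted rule might link each SDOH trigger to the nearest compatible attribute phrase within a token window; the type names and window size here are invented for the example.

```python
def link_by_rules(triggers, attributes, max_dist=10):
    """triggers/attributes: dicts like {"type": "Tobacco", "pos": 12};
    returns (trigger, attribute) pairs by a nearest-compatible rule."""
    relations = []
    for trig in triggers:
        compatible = [
            a for a in attributes
            if a["type"] in {"StatusTime", "Amount", "Frequency"}
            and abs(a["pos"] - trig["pos"]) <= max_dist
        ]
        if compatible:
            nearest = min(compatible, key=lambda a: abs(a["pos"] - trig["pos"]))
            relations.append((trig, nearest))
    return relations  # fully inspectable, hence the most explainable system
```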
https://arxiv.org/abs/2301.11386
Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more consequential and explainable research is required for optimal impact on clinical psychology practice and personalized mental healthcare. To bridge this gap, we posit two significant dimensions: (1) causal analysis, to illustrate a cause-and-effect relationship in user-generated text; (2) perception mining, to infer the psychological perspectives of social effects on online users' intentions. Within the scope of Natural Language Processing (NLP), we further explore critical areas of inquiry associated with these two dimensions, specifically through recent advancements in discourse analysis. This position paper guides the community to explore solutions in this space and advance the state of practice in developing conversational agents for inferring mental health from social media. We advocate for a more explainable approach toward modeling computational psychology problems through the lens of language, as we observe an increasing number of research contributions in dataset construction and problem formulation for causal relation extraction and perception enhancement while inferring mental states.
https://arxiv.org/abs/2301.11004
Zero-Shot Relation Extraction (ZRE) is the task of Relation Extraction where the training and test sets have no shared relation types. This very challenging domain is a good test of a model's ability to generalize. Previous approaches to ZRE reframed relation extraction as Question Answering (QA), allowing for the use of pre-trained QA models. However, this method required manually creating gold question templates for each new relation. Here, we do away with these gold templates and instead learn a model that can generate questions for unseen relations. Our technique can successfully translate relation descriptions into relevant questions, which are then leveraged to generate the correct tail entity. On tail entity extraction, we outperform the previous state-of-the-art by more than 16 F1 points without using gold question templates. On the RE-QA dataset where no previous baseline for relation extraction exists, our proposed algorithm comes within 0.7 F1 points of a system that uses gold question templates. Our model also outperforms the state-of-the-art ZRE baselines on the FewRel and WikiZSL datasets, showing that QA models no longer need template questions to match the performance of models specifically tailored to the ZRE task. Our implementation is available at this https URL.
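End to end, the pipeline is two learned components chained together: a question generator conditioned on the relation description, then an extractive QA model. A sketch with both passed in as callables (the input format for the generator is an assumption):

```python
def zero_shot_tail(context, head_entity, relation_description,
                   question_generator, qa_model):
    """question_generator: str -> str (trained to verbalize relations);
    qa_model: (question, context) -> str, the predicted tail entity."""
    question = question_generator(
        f"relation: {relation_description} ; head: {head_entity}"
    )
    # No gold template needed: the generated question drives extraction.
    return qa_model(question, context)
```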
https://arxiv.org/abs/2301.09640