Event Relation Extraction (ERE) aims to extract multiple kinds of relations among events in texts. However, existing methods simply treat event relations as separate classes, which fails to adequately capture the intrinsic semantics of these relations. To comprehensively understand their intrinsic semantics, in this paper we obtain prototype representations for each type of event relation and propose a Prototype-Enhanced Matching (ProtoEM) framework for the joint extraction of multiple kinds of event relations. Specifically, ProtoEM extracts event relations in two steps: prototype representing and prototype matching. In the first step, to capture the connotations of different event relations, ProtoEM uses examples to represent the prototypes corresponding to these relations. Subsequently, to capture the interdependence among event relations, it constructs a dependency graph over these prototypes and uses a Graph Neural Network (GNN)-based module to model it. In the second step, it obtains the representations of new event pairs and calculates their similarity with the prototypes obtained in the first step to determine which types of event relations they belong to. Experimental results on the MAVEN-ERE dataset demonstrate that the proposed ProtoEM framework can effectively represent the prototypes of event relations and obtains a significant improvement over baseline models.
https://arxiv.org/abs/2309.12892
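The two-step idea in the ProtoEM abstract (build one prototype per relation from examples, then match new event pairs against the prototypes by similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-of-examples prototype, the cosine similarity, the toy 3-dimensional vectors, and the relation names `CAUSE` and `BEFORE` are all assumptions made for illustration, and the GNN-based dependency modeling is omitted entirely.

```python
import math

def mean_vector(vectors):
    """Average equal-length vectors into a single prototype vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_prototypes(examples_by_relation):
    """Step 1 (assumed form): one prototype per relation, the mean of its examples."""
    return {rel: mean_vector(vecs) for rel, vecs in examples_by_relation.items()}

def match(pair_vector, prototypes):
    """Step 2: score a new event-pair vector against every prototype."""
    scores = {rel: cosine(pair_vector, proto) for rel, proto in prototypes.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical toy embeddings of annotated event pairs.
examples = {
    "CAUSE": [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0]],
    "BEFORE": [[0.0, 1.0, 0.1], [0.2, 0.8, 0.3]],
}
prototypes = build_prototypes(examples)
label, scores = match([0.9, 0.1, 0.1], prototypes)
print(label, scores)
```

In a real system the vectors would come from a text encoder, and the prototypes would be refined over the relation dependency graph before matching.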
This paper studies the problem of traffic flow forecasting, which aims to predict future traffic conditions on the basis of road networks and traffic conditions in the past. The problem is typically solved by modeling complex spatio-temporal correlations in traffic data using spatio-temporal graph neural networks (GNNs). However, the performance of these methods is still far from satisfactory since GNNs usually have limited representation capacity when it comes to complex traffic networks. Graphs, by nature, fall short in capturing non-pairwise relations. Even worse, existing methods follow the paradigm of message passing that aggregates neighborhood information linearly, which fails to capture complicated spatio-temporal high-order interactions. To tackle these issues, in this paper, we propose a novel model named Dynamic Hypergraph Structure Learning (DyHSL) for traffic flow prediction. To learn non-pairwise relationships, our DyHSL extracts hypergraph structural information to model dynamics in the traffic networks, and updates each node representation by aggregating messages from its associated hyperedges. Additionally, to capture high-order spatio-temporal relations in the road network, we introduce an interactive graph convolution block, which further models the neighborhood interaction for each node. Finally, we integrate these two views into a holistic multi-scale correlation extraction module, which conducts temporal pooling with different scales to model different temporal patterns. Extensive experiments on four popular traffic benchmark datasets demonstrate the effectiveness of our proposed DyHSL compared with a broad range of competing baselines.
https://arxiv.org/abs/2309.12028
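The DyHSL abstract describes updating each node by aggregating messages from its associated hyperedges. A minimal sketch of one such node-to-hyperedge-to-node round follows. It assumes a fixed, hand-written hypergraph and plain mean aggregation; in the paper the hypergraph structure is learned and the aggregation is parameterized, so this only illustrates why a hyperedge can relate more than two nodes at once.

```python
def hypergraph_aggregate(node_feats, hyperedges):
    """One round of node -> hyperedge -> node mean aggregation.

    node_feats: list of feature vectors, one per node.
    hyperedges: list of lists of node indices (a hyperedge may hold any number of nodes).
    """
    dim = len(node_feats[0])
    # Step 1: each hyperedge message is the mean of its member nodes' features.
    edge_msgs = [
        [sum(node_feats[i][d] for i in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    # Step 2: each node averages the messages of the hyperedges that contain it.
    updated = []
    for node in range(len(node_feats)):
        incident = [msg for members, msg in zip(hyperedges, edge_msgs) if node in members]
        if not incident:
            # Isolated node: keep its features unchanged.
            updated.append(list(node_feats[node]))
            continue
        updated.append([sum(m[d] for m in incident) / len(incident) for d in range(dim)])
    return updated

# Hypothetical example: four road sensors with a scalar flow reading each.
feats = [[1.0], [3.0], [5.0], [7.0]]
edges = [[0, 1, 2], [2, 3]]  # the first hyperedge joins three sensors at once
new_feats = hypergraph_aggregate(feats, edges)
print(new_feats)  # [[3.0], [3.0], [4.5], [6.0]]
```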
Relation triple extraction (RTE) is an essential task in information extraction and knowledge graph construction. Despite recent advancements, existing methods still exhibit certain limitations. They simply employ general-purpose pre-trained models and do not consider the specificity of RTE tasks. Moreover, existing tagging-based approaches typically decompose the RTE task into two subtasks, first identifying subjects and then identifying objects and relations. They focus solely on extracting relational triples from subject to object, neglecting that once the extraction of a subject fails, all triples associated with that subject are lost. To address these issues, we propose BitCoin, an innovative Bidirectional tagging and supervised Contrastive learning based joint relational triple extraction framework. Specifically, we design a supervised contrastive learning method that considers multiple positives per anchor rather than restricting it to just one positive. Furthermore, a penalty term is introduced to prevent excessive similarity between the subject and object. Our framework implements taggers in two directions, enabling triple extraction from subject to object and from object to subject. Experimental results show that BitCoin achieves state-of-the-art results on the benchmark datasets and significantly improves the F1 score on Normal, SEO, EPO, and multiple relation extraction tasks.
https://arxiv.org/abs/2309.11853
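The phrase "multiple positives per anchor" in the BitCoin abstract refers to the supervised form of contrastive learning, where every sample sharing the anchor's label counts as a positive. The sketch below shows a generic supervised contrastive loss of that kind. It is an assumption that BitCoin's loss has this general shape; the penalty term between subject and object mentioned in the abstract is not specified there and is not modeled here, and the temperature value and toy vectors are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm else list(v)

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # All other samples with the same label are positives for anchor i.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Log-sum-exp over all non-anchor samples, computed stably.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits.values())
        )
        # Average negative log-probability over the anchor's positives.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Hypothetical 2-D embeddings forming two tight clusters.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clustered = supcon_loss(points, [0, 0, 1, 1])  # labels agree with the clusters
mixed = supcon_loss(points, [0, 1, 0, 1])      # labels cut across the clusters
print(clustered, mixed)
```

The loss is small when same-label embeddings sit close together and large when labels cut across the clusters, which is the signal that pulls positives together during training.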
Recent span-based joint extraction models have demonstrated significant advantages in both entity recognition and relation extraction. These models treat text spans as candidate entities and span pairs as candidate relation tuples, achieving state-of-the-art results on datasets such as ADE. However, these models encounter a large number of non-entity spans and irrelevant span pairs during these tasks, which significantly impairs model performance. To address this issue, this paper introduces a span-based multitask entity-relation joint extraction model. The approach employs multitask learning to alleviate the impact of negative samples on the entity and relation classifiers. Additionally, we leverage the Intersection over Union (IoU) concept to introduce positional information into the entity classifier, enabling span boundary detection. Furthermore, by incorporating the entity logits predicted by the entity classifier into the embedded representation of entity pairs, the semantic input for the relation classifier is enriched. Experimental results demonstrate that our proposed model can effectively mitigate the adverse effects of excessive negative samples on model performance. The model achieves F1 scores of 73.61%, 53.72%, and 83.72% on three widely used public datasets, namely CoNLL04, SciERC, and ADE, respectively.
https://arxiv.org/abs/2309.09713
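The Intersection over Union measure mentioned in this abstract can be computed for token spans as sketched below. How the paper feeds IoU into the entity classifier is not stated in the abstract, so this only shows the measure itself. The half-open `(start, end)` span convention and the helper name `best_iou` are assumptions for illustration.

```python
def span_iou(a, b):
    """IoU of two token spans given as half-open (start, end) index pairs."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0

def best_iou(candidate, gold_spans):
    """Highest IoU between a candidate span and any gold entity span."""
    return max((span_iou(candidate, g) for g in gold_spans), default=0.0)

# A candidate covering tokens 2..4 against gold entities at tokens 3..6 and 10..11.
print(span_iou((2, 5), (3, 7)))            # 0.4
print(best_iou((2, 5), [(3, 7), (10, 12)]))  # 0.4
```

A score like this gives a graded signal for near-miss spans, instead of treating every span that is not an exact entity match as an equally negative sample.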
Contextual Relation Extraction (CRE) is mainly used for constructing a knowledge graph with the help of an ontology. It supports various tasks such as semantic search, query answering, and textual entailment. Relation extraction identifies the entities in raw text and the relations among them. An efficient and accurate CRE system is essential for creating domain knowledge in the biomedical industry. Existing Machine Learning and Natural Language Processing (NLP) techniques cannot efficiently predict complex relations from sentences that contain more than two relations and unspecified entities. In this work, deep learning techniques are used to identify the appropriate semantic relation based on context from multiple sentences. Although various machine learning models have been used for relation extraction, they give good results only for binary relations, i.e., relations that hold between exactly two entities in a sentence. Machine learning models are not suited to complex sentences containing words with multiple meanings. To address these issues, hybrid deep learning models have been used to extract relations from complex sentences effectively. This paper analyzes various deep learning models used for relation extraction.
https://arxiv.org/abs/2309.06814
Electronic health records contain an enormous amount of valuable information, but much of it is recorded as free text. Information extraction transforms such character sequences into structured data that can be used for secondary analysis. However, traditional information extraction components, such as named entity recognition and relation extraction, require annotated data to optimize the model parameters, which has become one of the major bottlenecks in building information extraction systems. With large language models achieving good performance on various downstream NLP tasks without parameter tuning, it becomes possible to use them for zero-shot information extraction. In this study, we explore whether the most popular large language model, ChatGPT, can extract useful information from radiological reports. We first design a prompt template for the information of interest in the CT reports. Then, we generate the prompts by combining the prompt template with the CT reports and use them as inputs to ChatGPT to obtain the responses. A post-processing module is developed to transform the responses into structured extraction results. We conducted experiments with 847 CT reports collected from Peking University Cancer Hospital. The experimental results indicate that ChatGPT can achieve competitive performance on some extraction tasks compared with the baseline information extraction system, but some limitations remain to be addressed.
https://arxiv.org/abs/2309.01398
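The pipeline in this abstract (a prompt template, prompts built from the template plus a CT report, and a post-processing module that structures the responses) could look roughly like the sketch below. The template wording, the field names, the `field: value` answer format, and the sample report and response are all hypothetical; the paper's actual prompts are not given in the abstract. The call to the language model itself is left out, so the response string is hand-written.

```python
# Hypothetical prompt template; the paper's real wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "Read the CT report below and answer each field on its own line "
    "in the form 'field: value'. Use 'unknown' if the report does not say.\n"
    "Fields: {fields}\n\nReport:\n{report}"
)

def build_prompt(report, fields):
    """Combine the template with one report and the fields of interest."""
    return PROMPT_TEMPLATE.format(fields=", ".join(fields), report=report.strip())

def parse_response(response, fields):
    """Post-process a free-text response into a dict with one entry per field.

    Fields are expected in lowercase. Lines that do not match a requested
    field, and 'unknown' answers, are ignored (the value stays None).
    """
    result = {f: None for f in fields}
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if sep and key in result and value and value.lower() != "unknown":
            result[key] = value
    return result

fields = ["tumor location", "tumor size"]
prompt = build_prompt("A 2.1 cm nodule is seen in the right upper lobe.", fields)
# Hand-written stand-in for a model response.
response = "Tumor location: right upper lobe\nTumor size: unknown\nNote: none"
extracted = parse_response(response, fields)
print(extracted)  # {'tumor location': 'right upper lobe', 'tumor size': None}
```

Constraining the answer format in the prompt keeps the post-processing step simple, at the cost of silently dropping any response line the model phrases differently.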
Relation Extraction (RE) is the task of extracting semantic relationships between entities in a sentence and aligning them to relations defined in a vocabulary, which is generally in the form of a Knowledge Graph (KG) or an ontology. Various approaches have been proposed to address this task. However, applying these techniques to biomedical text often yields unsatisfactory results because the nature of biomedical relations makes them hard to infer directly from sentences. To address these issues, we present a novel technique called ReOnto, which makes use of neuro-symbolic knowledge for the RE task. ReOnto employs a graph neural network to acquire the sentence representation and leverages publicly accessible ontologies as prior knowledge to identify the sentential relation between two entities. The approach involves extracting the relation path between the two entities from the ontology. We evaluate the effect of using symbolic knowledge from ontologies with graph neural networks. Experimental results on two public biomedical datasets, BioRel and ADE, show that our method outperforms all the baselines (by approximately 3%).
https://arxiv.org/abs/2309.01370
Document-level relation extraction aims to identify relationships between entities within a document. Current methods rely on text-based encoders and employ various hand-coded pooling heuristics to aggregate information from entity mentions and associated contexts. In this paper, we replace these rigid pooling functions with explicit graph relations by leveraging the intrinsic graph processing capabilities of the Transformer model. We propose a joint text-graph Transformer model, and a graph-assisted declarative pooling (GADePo) specification of the input which provides explicit and high-level instructions for information aggregation. This allows the pooling process to be guided by domain-specific knowledge or desired outcomes but still learned by the Transformer, leading to more flexible and customizable pooling strategies. We extensively evaluate our method across diverse datasets and models, and show that our approach yields promising results that are comparable to those achieved by the hand-coded pooling functions.
https://arxiv.org/abs/2308.14423
Recently, instruction fine-tuning has risen to prominence as a potential method for enhancing the zero-shot capabilities of Large Language Models (LLMs) on novel tasks. This technique has shown an exceptional ability to boost the performance of moderately sized LLMs, sometimes even reaching performance levels comparable to those of much larger model variants. Our focus is on the robustness of instruction-tuned LLMs to seen and unseen tasks. We explored six models, including Alpaca, Vicuna, WizardLM, and traditional task-oriented models (Flan-T5-XL/XXL, T0++), using real-world relation extraction datasets as case studies. We carried out a comprehensive evaluation of these instruction-following LLMs, which have been tuned on open-domain instructions and task-oriented instructions. The main discussion concerns their performance and robustness to instructions. We observed that in most cases a model's performance on unfamiliar instructions worsens significantly, and that robustness to RE instructions is worse than robustness to QA instructions. Further, we found that up to a certain parameter size threshold (3B), the performance of the FLAN-T5 model improves as the parameter count increases. Across scales, FLAN-T5 models are less robust to RE instructions than to QA instructions.
https://arxiv.org/abs/2308.14306
Joint entity and relation extraction is the fundamental task of information extraction, consisting of two subtasks: named entity recognition and relation extraction. Most existing joint extraction methods suffer from issues of feature confusion or inadequate interaction between two subtasks. In this work, we propose a Co-Attention network for joint entity and Relation Extraction (CARE). Our approach involves learning separate representations for each subtask, aiming to avoid feature overlap. At the core of our approach is the co-attention module that captures two-way interaction between two subtasks, allowing the model to leverage entity information for relation prediction and vice versa, thus promoting mutual enhancement. Extensive experiments on three joint entity-relation extraction benchmark datasets (NYT, WebNLG and SciERC) show that our proposed model achieves superior performance, surpassing existing baseline models.
https://arxiv.org/abs/2308.12531
Extracting relational triples (subject, predicate, object) from text enables the transformation of unstructured text data into structured knowledge. The named entity recognition (NER) and the relation extraction (RE) are two foundational subtasks in this knowledge generation pipeline. The integration of subtasks poses a considerable challenge due to their disparate nature. This paper presents a novel approach that converts the triple extraction task into a graph labeling problem, capitalizing on the structural information of dependency parsing and graph recursive neural networks (GRNNs). To integrate subtasks, this paper proposes a dynamic feedback forest algorithm that connects the representations of subtasks by inference operations during model training. Experimental results demonstrate the effectiveness of the proposed method.
https://arxiv.org/abs/2308.11411
Relation Extraction (RE) is a pivotal task in automatically extracting structured information from unstructured text. In this paper, we present a multi-faceted approach that integrates representative examples with co-set expansion. The primary goal of our method is to enhance relation classification accuracy and mitigate confusion between contrastive classes. Our approach begins by seeding each relationship class with representative examples. Subsequently, our co-set expansion algorithm enriches training objectives by incorporating similarity measures between target pairs and representative pairs from the target class. Moreover, the co-set expansion process involves a class ranking procedure that takes into account exemplars from contrastive classes. Contextual details surrounding relation mentions are harnessed via context-free Hearst patterns to ascertain contextual similarity. Empirical evaluation demonstrates the efficacy of our co-set expansion approach, resulting in a significant enhancement of relation classification performance. Our method improves accuracy by at least 1 percent in most settings, on top of existing fine-tuning approaches. To further refine our approach, we conduct an in-depth analysis that focuses on tuning contrastive examples. This strategic selection and tuning effectively reduces confusion between classes that share similarities, leading to more precise classification. Experimental results underscore the effectiveness of our proposed framework for relation extraction. The synergy between co-set expansion and context-aware prompt tuning contributes substantially to improved classification accuracy. Furthermore, the reduction in confusion between contrastive classes through contrastive example tuning validates the robustness and reliability of our method.
https://arxiv.org/abs/2308.11720
Biomedical Natural Language Processing (NLP) tends to become cumbersome for most researchers, frequently due to the amount and heterogeneity of text to be processed. To address this challenge, the industry is continuously developing highly efficient tools and creating more flexible engineering solutions. This work presents the integration of industry data engineering solutions for efficient data processing with academic systems developed for Named Entity Recognition (LasigeUnicage_NER) and Relation Extraction (BiOnt). Our design integrates those components with external knowledge in the form of additional training data from other datasets and biomedical ontologies. We used this pipeline in the 2022 LitCoin NLP Challenge, where our team LasigeUnicage was awarded the 7th Prize out of approximately 200 participating teams, reflecting a successful collaboration between academia (LASIGE) and industry (Unicage). The software supporting this work is available at this https URL.
https://arxiv.org/abs/2308.05609
We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and perform better at relation extraction compared to those trained on the original RadGraph dataset. Our work provides the foundation for developing automated systems that can track disease progression over time and develop information extraction models that leverage the natural hierarchy of labels in the medical domain.
https://arxiv.org/abs/2308.05046
Dialogue relation extraction (DRE), which identifies the relations between argument pairs in dialogue text, suffers greatly from the frequent occurrence of personal pronouns and of entity and speaker coreference. This work introduces a new benchmark dataset, DialogRE^C+, which brings coreference resolution into the DRE scenario. With the aid of high-quality coreference knowledge, the reasoning over argument relations is expected to improve. In the DialogRE^C+ dataset, we manually annotate a total of 5,068 coreference chains over 36,369 argument mentions based on the existing DialogRE data, explicitly marking four coreference chain types: speaker chain, person chain, location chain, and organization chain. We further develop 4 coreference-enhanced graph-based DRE models, which learn effective coreference representations to improve the DRE task. We also train a coreference resolution model on our annotations and evaluate the effect of automatically extracted coreference chains, demonstrating the practicality of our dataset and its potential for other domains and tasks.
https://arxiv.org/abs/2308.04498
We present FinTree, a Financial Dataset Pretrain Transformer Encoder for Relation Extraction. Starting from an encoder language model, we further pretrain FinTree on a financial dataset, adapting the model to financial domain tasks. FinTree stands out with its novel structure that predicts a masked token instead of the conventional [CLS] token, inspired by the Pattern Exploiting Training methodology. This structure allows for more accurate relation predictions between two given entities. The model is trained with a unique input pattern that provides contextual and positional information about the entities of interest, and a post-processing step ensures accurate predictions in line with the entity types. Our experiments demonstrate that FinTree achieves strong performance on REFinD, a large-scale financial relation extraction dataset. The code and pretrained models are available at this https URL.
https://arxiv.org/abs/2307.13900
The Zero-Shot Learning (ZSL) task pertains to the identification of entities or relations in texts that were not seen during training. ZSL has emerged as a critical research area due to the scarcity of labeled data in specific domains, and its applications have grown significantly in recent years. With the advent of large pretrained language models, several novel methods have been proposed, resulting in substantial improvements in ZSL performance. There is a growing demand, both in the research community and in industry, for a comprehensive ZSL framework that facilitates the development and accessibility of the latest methods and pretrained models. In this study, we propose a novel ZSL framework called Zshot that aims to address the aforementioned challenges. Our primary objective is to provide a platform that allows researchers to compare different state-of-the-art ZSL methods on standard benchmark datasets. Additionally, we have designed our framework to support industry with readily available APIs for production under the standard SpaCy NLP pipeline. Our API is extensible and evaluable; moreover, we include numerous enhancements, such as boosting accuracy with pipeline ensembling, and visualization utilities available as a SpaCy extension.
https://arxiv.org/abs/2307.13497
We evaluate four state-of-the-art instruction-tuned large language models (LLMs) -- ChatGPT, Flan-T5 UL2, Tk-Instruct, and Alpaca -- on a set of 13 real-world clinical and biomedical natural language processing (NLP) tasks in English, such as named-entity recognition (NER), question-answering (QA), relation extraction (RE), etc. Our overall results demonstrate that the evaluated LLMs begin to approach performance of state-of-the-art models in zero- and few-shot scenarios for most tasks, and particularly well for the QA task, even though they have never seen examples from these tasks before. However, we observed that the classification and RE tasks perform below what can be achieved with a specifically trained model for the medical field, such as PubMedBERT. Finally, we noted that no LLM outperforms all the others on all the studied tasks, with some models being better suited for certain tasks than others.
https://arxiv.org/abs/2307.12114
Document-level joint entity and relation extraction is a challenging information extraction problem that requires a unified approach in which a single neural network performs four sub-tasks: mention detection, coreference resolution, entity classification, and relation extraction. Existing methods often use a sequential multi-task learning approach, in which an arbitrary decomposition causes the current task to depend only on the previous one, missing possible more complex relationships between them. In this paper, we present a multi-task learning framework with bidirectional memory-like dependency between tasks to address those drawbacks and solve the joint problem more accurately. Our empirical studies show that the proposed approach outperforms the existing methods and achieves state-of-the-art results on the BioCreative V CDR corpus.
https://arxiv.org/abs/2307.11762
Mathematics is a highly specialized domain with its own unique set of challenges that has seen limited study in natural language processing. However, mathematics is used in a wide variety of fields and multidisciplinary research in many different domains often relies on an understanding of mathematical concepts. To aid researchers coming from other fields, we develop a prototype system for searching for and defining mathematical concepts in context, focusing on the field of category theory. This system, Parmesan, depends on natural language processing components including concept extraction, relation extraction, definition extraction, and entity linking. In developing this system, we show that existing techniques cannot be applied directly to the category theory domain, and suggest hybrid techniques that do perform well, though we expect the system to evolve over time. We also provide two cleaned mathematical corpora that power the prototype system, which are based on journal articles and wiki pages, respectively. The corpora have been annotated with dependency trees, lemmas, and part-of-speech tags.
https://arxiv.org/abs/2307.06699