DocRED-FE: A Document-Level Fine-Grained Entity And Relation Extraction Dataset

Abstract

Joint entity and relation extraction (JERE) is one of the most important tasks in information extraction. However, most existing work focuses on sentence-level, coarse-grained JERE, which has limitations in real-world scenarios. In this paper, we construct DocRED-FE, a large-scale document-level fine-grained JERE dataset that extends DocRED with fine-grained entity types. Specifically, we redesign a hierarchical entity type schema comprising 11 coarse-grained types and 119 fine-grained types, and then manually re-annotate DocRED according to this schema. Through comprehensive experiments we find that (1) DocRED-FE is challenging for existing JERE models, and (2) our fine-grained entity types benefit relation classification. We make DocRED-FE, with instructions, and the code for our baselines publicly available at this https URL.
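
As a rough illustration of what a two-level (coarse to fine) entity type schema might look like in code, here is a minimal Python sketch. The type names, the `SCHEMA` mapping, and the helpers `resolve` and `count_types` are hypothetical and chosen only for illustration; they are not taken from the DocRED-FE release, whose actual schema defines 11 coarse-grained and 119 fine-grained types.

```python
from dataclasses import dataclass

# Hypothetical toy schema: coarse-grained type -> its fine-grained subtypes.
# These names are illustrative assumptions, not the real DocRED-FE types.
SCHEMA = {
    "person": ["politician", "athlete"],
    "organization": ["company", "university"],
    "location": ["city", "country"],
}


@dataclass(frozen=True)
class EntityType:
    """A fine-grained type together with its coarse-grained parent."""
    coarse: str
    fine: str

    @property
    def path(self) -> str:
        # Render the hierarchy as "coarse/fine".
        return f"{self.coarse}/{self.fine}"


def resolve(fine: str, schema: dict = SCHEMA) -> EntityType:
    """Find the coarse-grained parent of a fine-grained type name."""
    for coarse, fines in schema.items():
        if fine in fines:
            return EntityType(coarse, fine)
    raise KeyError(f"unknown fine-grained type: {fine}")


def count_types(schema: dict = SCHEMA) -> tuple:
    """Return (number of coarse types, number of fine types)."""
    return len(schema), sum(len(fines) for fines in schema.values())
```

With this toy schema, `resolve("city").path` yields `"location/city"`, and `count_types()` returns `(3, 6)`; the same check on the real schema would be expected to give `(11, 119)`.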

URL

https://arxiv.org/abs/2303.11141

PDF

https://arxiv.org/pdf/2303.11141.pdf
