Abstract
General-purpose relation extractors, which can model arbitrary relations, are a core aspiration in information extraction. Efforts have been made to build general-purpose extractors that represent relations with their surface forms, or which jointly embed surface forms with relations from an existing knowledge graph. However, both of these approaches are limited in their ability to generalize. In this paper, we build on extensions of Harris' distributional hypothesis to relations, as well as recent advances in learning text representations (specifically, BERT), to build task-agnostic relation representations solely from entity-linked text. We show that these representations significantly outperform previous work on exemplar-based relation extraction (FewRel) even without using any of that task's training data. We also show that models initialized with our task-agnostic representations, and then tuned on supervised relation extraction datasets, significantly outperform previous methods on SemEval 2010 Task 8, KBP37, and TACRED.
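To make the idea of building relation representations from entity-linked text concrete, here is a minimal sketch. The marker tokens, the toy deterministic encoder, and the choice to concatenate the states at the two entity-start markers are illustrative assumptions for this sketch, not the paper's exact architecture:

```python
# Sketch: represent the relation between two marked entities as the
# concatenation of the encoder states at the two entity-start markers.
# toy_encode is a deterministic stand-in for a real encoder such as BERT.
import hashlib

# Hypothetical marker tokens wrapping the two entity mentions.
E1_START, E1_END = "[E1]", "[/E1]"
E2_START, E2_END = "[E2]", "[/E2]"

def toy_encode(token: str, dim: int = 8) -> list:
    """Deterministic pseudo-embedding of a token (stand-in for an encoder)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def relation_representation(tokens: list) -> list:
    """Concatenate the (toy) states at the two entity-start markers."""
    i1 = tokens.index(E1_START)
    i2 = tokens.index(E2_START)
    return toy_encode(tokens[i1]) + toy_encode(tokens[i2])

sentence = (
    f"{E1_START} Harris {E1_END} proposed the distributional "
    f"hypothesis at {E2_START} Penn {E2_END} ."
).split()
rep = relation_representation(sentence)
```

Because the representation depends only on the marked entity positions and the surrounding encoder, two sentences expressing the same relation can be compared directly, e.g. by cosine similarity of their representations, which is the spirit of the exemplar-based (FewRel) evaluation described above.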
URL
https://arxiv.org/abs/1906.03158