Abstract
Existing event-centric NLP models often apply only to a pre-defined ontology, which significantly restricts their generalization capabilities. This paper presents CEO, a novel Corpus-based Event Ontology induction model that relaxes the restriction imposed by pre-defined event ontologies. Without direct supervision, CEO leverages distant supervision from available summary datasets to detect corpus-wise salient events and exploits external event knowledge to force events within a short distance to have close embeddings. Experiments on three popular event datasets show that the schema induced by CEO has better coverage and higher accuracy than previous methods. Moreover, CEO is the first event ontology induction model that can induce a hierarchical event ontology with meaningful names on eleven open-domain corpora, making the induced schema more trustworthy and easier to curate further.
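The idea of forcing events within a short distance to have close embeddings can be illustrated with a small contrastive-style objective. This is only a sketch under stated assumptions, not the training objective used by CEO: the abstract does not specify the loss or how distance is measured, and the names `proximity_loss`, `threshold`, and `margin` are hypothetical. Pairwise event distances are assumed to be supplied as input.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def proximity_loss(embeddings, distances, threshold=2, margin=1.0):
    """Illustrative contrastive-style loss (an assumption, not CEO's objective).

    embeddings: dict mapping an event name to its embedding vector.
    distances:  dict mapping an (event, event) pair to a precomputed distance.
    Pairs whose distance is at most `threshold` are pulled together;
    all other pairs are pushed at least `margin` apart.
    """
    total = 0.0
    for (a, b), dist in distances.items():
        gap = euclidean(embeddings[a], embeddings[b])
        if dist <= threshold:
            total += gap ** 2                     # close events: penalize any gap
        else:
            total += max(0.0, margin - gap) ** 2  # distant events: penalize overlap
    return total / len(distances)


# Toy data (hypothetical): "attack" and "bombing" are close, "election" is far.
distances = {("attack", "bombing"): 1, ("attack", "election"): 5}
good = {"attack": [0.0, 0.0], "bombing": [0.0, 0.0], "election": [3.0, 4.0]}
bad = {"attack": [0.0, 0.0], "bombing": [3.0, 4.0], "election": [0.0, 0.0]}

good_loss = proximity_loss(good, distances)
bad_loss = proximity_loss(bad, distances)
```

Under this toy objective, an embedding that places close events together and distant events apart receives a lower loss than one that does the opposite.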
URL
https://arxiv.org/abs/2305.13521