Abstract
Existing approaches to zero-shot event detection usually train models on datasets annotated with known event types, then prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of event types and definitions is the key for models to learn to follow event definitions, whereas existing event extraction datasets focus on annotating many high-quality examples for a few event types. To verify our hypothesis, we construct an automatically generated Diverse Event Definition (DivED) dataset and conduct comparative studies. Our experiments reveal that a large number of event types (200) and diverse event definitions can significantly boost event extraction performance; on the other hand, performance does not scale beyond ten examples per event type. Beyond scaling, we incorporate event ontology information and hard-negative samples during training, further boosting performance. Based on these findings, we fine-tuned a LLaMA-2-7B model on our DivED dataset, yielding performance that surpasses SOTA large language models like GPT-3.5 across three open benchmarks on zero-shot event detection.
URL
https://arxiv.org/abs/2403.02586