Abstract
We introduce a meta-dataset for few-shot relation extraction. It includes two datasets derived from the existing supervised relation extraction datasets NYT29 (Takanobu et al., 2019; Nayak and Ng, 2020) and WIKIDATA (Sorokin and Gurevych, 2017), as well as a few-shot form of the TACRED dataset (Sabo et al., 2021). Importantly, all of these few-shot datasets were generated under realistic assumptions, such as: the test relations differ from any relations a model might have seen before, training data is limited, and the majority of candidate relation mentions do not correspond to any of the relations of interest. Using this large resource, we conduct a comprehensive evaluation of six recent few-shot relation extraction methods and observe that no method emerges as a clear winner. Further, overall performance on this task is low, indicating a substantial need for future research. We release all versions of the data, i.e., both supervised and few-shot, for future research.
URL
https://arxiv.org/abs/2404.04445