Abstract
Unsupervised relation extraction (URE) aims to extract relations between named entities from raw text without requiring manual annotations or pre-existing knowledge bases. Recent URE studies have placed notable emphasis on contrastive learning strategies for acquiring relation representations. However, these studies often overlook two important aspects: the inclusion of diverse positive pairs for contrastive learning and the exploration of appropriate loss functions. In this paper, we propose AugURE, which combines within-sentence pair augmentation with augmentation through cross-sentence pair extraction to increase the diversity of positive pairs and strengthen the discriminative power of contrastive learning. We also identify a limitation of the noise-contrastive estimation (NCE) loss for relation representation learning and propose applying a margin loss to sentence pairs. Experiments on the NYT-FB and TACRED datasets demonstrate that the proposed relation representation learning, combined with simple K-Means clustering, achieves state-of-the-art performance.
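The abstract does not give the form of the margin loss, so the following is only a minimal sketch of one common hinge-style formulation, not the authors' implementation. It assumes relation representations are plain vectors compared by cosine similarity; the function names, the triplet arrangement (anchor, positive, negative), and the margin value of 0.5 are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_margin_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style margin loss over sentence-pair similarities (assumed form).

    The loss is zero once the positive pair is more similar than the
    negative pair by at least `margin`; otherwise it grows linearly.
    """
    pos_sim = cosine(anchor, positive)
    neg_sim = cosine(anchor, negative)
    return max(0.0, margin - pos_sim + neg_sim)


# Positive pair already well separated from the negative pair: no penalty.
well_separated = pair_margin_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])

# Negative pair more similar than the positive pair: penalized.
violating = pair_margin_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

Unlike a softmax-based NCE objective, a loss of this shape stops pushing on pairs that already satisfy the margin, which is one plausible reason to prefer it; the abstract itself does not state the specific limitation of NCE that the authors identify.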
URL
https://arxiv.org/abs/2312.00552