Abstract
Relation extraction (RE) is a standard information extraction task that plays a major role in downstream applications such as knowledge discovery and question answering. Although decoder-only large language models excel at generative tasks, smaller encoder models remain the go-to architecture for RE. In this paper, we revisit fine-tuning such smaller models using a novel dual-encoder architecture with a joint contrastive and cross-entropy loss. Unlike previous methods that employ a fixed linear layer for predicate representations, our approach uses a second encoder to compute instance-specific predicate representations by infusing them with the real entity spans from the corresponding input instances. We conducted experiments on two biomedical RE datasets and two general-domain datasets. Our approach achieved F1 score improvements ranging from 1% to 2% over state-of-the-art methods with a simple but elegant formulation. Ablation studies confirm the importance of the various components built into the proposed architecture.
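The abstract describes the architecture only at a high level; the following is a minimal, hypothetical PyTorch sketch of how a dual encoder with a joint contrastive and cross-entropy objective might be wired up. Everything here (the class name DualEncoderRE, [CLS] pooling, cosine similarity, the 0.05 temperature, and the equal loss weighting) is an assumption for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn
from transformers import AutoModel

class DualEncoderRE(nn.Module):
    """Hypothetical sketch of a dual-encoder relation extractor: one encoder
    embeds the input instance (with marked entity spans), a second encoder
    embeds entity-infused predicate verbalizations, and training combines a
    contrastive term with a standard cross-entropy term."""

    def __init__(self, model_name: str, num_relations: int):
        super().__init__()
        self.instance_encoder = AutoModel.from_pretrained(model_name)
        self.predicate_encoder = AutoModel.from_pretrained(model_name)
        hidden = self.instance_encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_relations)
        self.temperature = 0.05  # assumed contrastive temperature

    def forward(self, instance_inputs, predicate_inputs, labels):
        # [CLS] embedding of each input instance: (batch, hidden)
        inst = self.instance_encoder(**instance_inputs).last_hidden_state[:, 0]
        # [CLS] embeddings of entity-infused predicate verbalizations,
        # one per relation type: (num_relations, hidden)
        pred = self.predicate_encoder(**predicate_inputs).last_hidden_state[:, 0]

        # Contrastive term: pull each instance embedding toward the
        # verbalization of its gold predicate (batch x num_relations sims)
        sims = F.cosine_similarity(inst.unsqueeze(1), pred.unsqueeze(0), dim=-1)
        contrastive_loss = F.cross_entropy(sims / self.temperature, labels)

        # Conventional cross-entropy over a linear classification head
        ce_loss = F.cross_entropy(self.classifier(inst), labels)

        return contrastive_loss + ce_loss  # equal weighting assumed
```

In this reading, the second encoder replaces the fixed linear layer of prior work: because the predicate verbalizations contain the instance's actual entity spans, their embeddings become instance-specific. The paper's own pooling, loss weighting, and negative sampling may differ.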
URL
https://arxiv.org/abs/2503.17799