Abstract
Relation extraction (RE) is an important task that aims to identify the relationships between entities in texts. While large language models (LLMs) have shown remarkable in-context learning (ICL) capability for general zero- and few-shot learning, recent studies indicate that current LLMs still struggle with zero- and few-shot RE. Previous studies have mainly been dedicated to designing prompt formats and selecting good examples to improve ICL-based RE. Although both factors are vital for ICL, if one could fundamentally boost the ICL capability of LLMs on RE, zero- and few-shot RE performance via ICL would improve significantly. To this end, we introduce \textsc{Micre} (\textbf{M}eta \textbf{I}n-\textbf{C}ontext learning of LLMs for \textbf{R}elation \textbf{E}xtraction), a new meta-training framework for zero- and few-shot RE in which an LLM is tuned to perform ICL on a diverse collection of RE datasets (i.e., learning to learn in context for RE). Through meta-training, the model learns a new RE task in context more effectively by conditioning on a few training examples, with no parameter updates or task-specific templates at inference time, enabling better zero- and few-shot task generalization. We experiment with \textsc{Micre} on various LLMs of different model scales and 12 public RE datasets, and then evaluate it on unseen RE benchmarks under zero- and few-shot settings. \textsc{Micre} delivers comparable or superior performance relative to a range of baselines, including supervised fine-tuning and typical in-context learning methods. We find that the gains are particularly significant for larger model scales, and that using a diverse set of meta-training RE datasets is key to the improvements. Empirically, we show that \textsc{Micre} can transfer relation semantic knowledge via relation label names during inference on target RE datasets.
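To make the meta-training recipe described in the abstract concrete, the following is a minimal sketch (not the authors' released code): each meta-training step samples an RE dataset, packs k labeled demonstrations plus one query into a single in-context prompt, and fine-tunes a causal LLM with the language-modeling loss restricted to the query's relation label. The prompt wording, toy dataset, model choice, and hyperparameters are all illustrative assumptions.

```python
# Illustrative sketch of meta in-context training for RE (assumed details,
# not the authors' implementation). One step = sample a dataset, build an
# in-context prompt from k demonstrations + a query, train on the label only.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Toy stand-in for the collection of meta-training RE datasets (assumed format).
META_DATASETS = {
    "toy_corp": [
        {"text": "Apple acquired Beats in 2014.",
         "head": "Beats", "tail": "Apple", "relation": "acquired_by"},
        {"text": "Tim Cook is the CEO of Apple.",
         "head": "Tim Cook", "tail": "Apple", "relation": "ceo_of"},
        {"text": "Beats was founded by Dr. Dre.",
         "head": "Beats", "tail": "Dr. Dre", "relation": "founded_by"},
    ],
}

def format_example(ex, with_label=True):
    """Render one RE instance as a demonstration (prompt wording is an assumption)."""
    line = (f"Text: {ex['text']}\n"
            f"Head entity: {ex['head']}\nTail entity: {ex['tail']}\nRelation:")
    return f"{line} {ex['relation']}\n\n" if with_label else line

def build_meta_batch(tokenizer, k=2):
    """Sample a dataset, k demonstrations, and a query; mask loss to the label tokens."""
    dataset = random.choice(list(META_DATASETS.values()))
    sampled = random.sample(dataset, k + 1)
    demos, query = sampled[:k], sampled[k]
    prompt = "".join(format_example(d) for d in demos) + format_example(query, with_label=False)
    target = " " + query["relation"] + tokenizer.eos_token
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(target, add_special_tokens=False, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # no loss on the in-context prompt itself
    return input_ids, labels

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # small model for illustration
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    for step in range(3):  # a few toy meta-training steps
        input_ids, labels = build_meta_batch(tokenizer, k=2)
        loss = model(input_ids=input_ids, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        print(f"step {step}: loss = {loss.item():.3f}")
```

At inference on an unseen RE benchmark, the same prompt format would be used as-is: condition the meta-trained model on a few demonstrations (or none, in the zero-shot case) and decode the relation label name, with no parameter updates.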
URL
https://arxiv.org/abs/2404.17807