Abstract
Relation extraction (RE) is an important task that aims to identify the relationships between entities in texts. While large language models (LLMs) have shown remarkable in-context learning (ICL) capability for general zero- and few-shot learning, recent studies indicate that current LLMs still struggle with zero- and few-shot RE. Previous studies have mainly been dedicated to designing prompt formats and selecting good examples to improve ICL-based RE. Although both factors are vital for ICL, if one could fundamentally boost the ICL capability of LLMs on RE, zero- and few-shot RE performance via ICL would improve significantly. To this end, we introduce \textsc{Micre} (\textbf{M}eta \textbf{I}n-\textbf{C}ontext learning of LLMs for \textbf{R}elation \textbf{E}xtraction), a new meta-training framework for zero- and few-shot RE in which an LLM is tuned to perform ICL on a diverse collection of RE datasets (i.e., learning to learn in context for RE). Through meta-training, the model learns a new RE task in context more effectively by conditioning on a few training examples, with no parameter updates or task-specific templates at inference time, enabling better zero- and few-shot task generalization. We experiment with \textsc{Micre} on various LLMs of different model scales and 12 public RE datasets, and then evaluate it on unseen RE benchmarks under zero- and few-shot settings. \textsc{Micre} delivers comparable or superior performance relative to a range of baselines, including supervised fine-tuning and typical in-context learning methods. We find that the gains are particularly significant for larger model scales, and that using a diverse set of meta-training RE datasets is key to the improvements. Empirically, we show that \textsc{Micre} can transfer relation semantic knowledge via relation label names during inference on target RE datasets.
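To make the meta-training recipe described in the abstract concrete, the following is a minimal sketch (not the authors' released code): each meta-training step samples an RE dataset, packs k labeled demonstrations plus one query into a single in-context prompt, and fine-tunes a causal LLM with the language-modeling loss restricted to the query's relation label. The prompt wording, toy dataset, model choice, and hyperparameters are all illustrative assumptions.

```python
# Illustrative sketch of meta in-context training for RE (assumed details,
# not the authors' implementation). One step = sample a dataset, build an
# in-context prompt from k demonstrations + a query, train on the label only.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Toy stand-in for the collection of meta-training RE datasets (assumed format).
META_DATASETS = {
    "toy_corp": [
        {"text": "Apple acquired Beats in 2014.",
         "head": "Beats", "tail": "Apple", "relation": "acquired_by"},
        {"text": "Tim Cook is the CEO of Apple.",
         "head": "Tim Cook", "tail": "Apple", "relation": "ceo_of"},
        {"text": "Beats was founded by Dr. Dre.",
         "head": "Beats", "tail": "Dr. Dre", "relation": "founded_by"},
    ],
}

def format_example(ex, with_label=True):
    """Render one RE instance as a demonstration (prompt wording is an assumption)."""
    line = (f"Text: {ex['text']}\n"
            f"Head entity: {ex['head']}\nTail entity: {ex['tail']}\nRelation:")
    return f"{line} {ex['relation']}\n\n" if with_label else line

def build_meta_batch(tokenizer, k=2):
    """Sample a dataset, k demonstrations, and a query; mask loss to the label tokens."""
    dataset = random.choice(list(META_DATASETS.values()))
    sampled = random.sample(dataset, k + 1)
    demos, query = sampled[:k], sampled[k]
    prompt = "".join(format_example(d) for d in demos) + format_example(query, with_label=False)
    target = " " + query["relation"] + tokenizer.eos_token
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(target, add_special_tokens=False, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # no loss on the in-context prompt itself
    return input_ids, labels

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # small model for illustration
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    for step in range(3):  # a few toy meta-training steps
        input_ids, labels = build_meta_batch(tokenizer, k=2)
        loss = model(input_ids=input_ids, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        print(f"step {step}: loss = {loss.item():.3f}")
```

At inference on an unseen RE benchmark, the same prompt format would be used as-is: condition the meta-trained model on a few demonstrations (or none, in the zero-shot case) and decode the relation label name, with no parameter updates.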
URL
https://arxiv.org/abs/2404.17807