Paper Reading AI Learner

MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks

2024-03-30 19:43:45
Letian Peng, Zilong Wang, Feng Yao, Zihan Wang, Jingbo Shang

Abstract

Information extraction (IE) is a fundamental area of natural language processing where prompting large language models (LLMs), even with in-context examples, cannot defeat small LMs tuned on very small IE datasets. We observe that IE tasks, such as named entity recognition and relation extraction, all focus on extracting important information, which can be formalized as label-to-span matching. In this paper, we propose a novel framework, MetaIE, to build a small LM as a meta-model by learning to extract "important information", i.e., the meta-understanding of IE, so that this meta-model can be adapted to all kinds of IE tasks effectively and efficiently. Specifically, MetaIE obtains the small LM via symbolic distillation from an LLM following the label-to-span scheme. We construct the distillation dataset by sampling sentences from language model pre-training datasets (e.g., OpenWebText in our implementation) and prompting an LLM to identify the typed spans of "important information". We evaluate the meta-model under the few-shot adaptation setting. Extensive results on 13 datasets from 6 IE tasks confirm that MetaIE can offer a better starting point for few-shot tuning on IE datasets and outperform other meta-models from (1) vanilla language model pre-training, (2) multi-IE-task pre-training with human annotations, and (3) single-IE-task symbolic distillation from an LLM. Moreover, we provide comprehensive analyses of MetaIE, such as the size of the distillation dataset, the meta-model architecture, and the size of the meta-model.
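The distillation step described above — prompting an LLM for typed spans of "important information" in sampled sentences, then turning those (label, span) pairs into training examples for the small meta-model — can be sketched as follows. This is a minimal illustration only; the function name, prompt format, and the "label: span" target serialization are assumptions, not the paper's exact implementation.

```python
def build_distillation_example(sentence, llm_extractions):
    """Turn LLM-identified (label, span) pairs for one sampled sentence
    into a label-to-span training target for the small meta-model.

    `llm_extractions` stands in for the parsed output of prompting an
    LLM with something like: "List the important information in this
    sentence as 'label: span' pairs." (hypothetical prompt wording)
    """
    targets = []
    for label, span in llm_extractions:
        # Keep only spans that actually occur in the sentence, since
        # the label-to-span scheme matches labels to source spans.
        if span in sentence:
            targets.append(f"{label}: {span}")
    # Input is the raw sentence; output serializes all matched pairs.
    return {"input": sentence, "target": "; ".join(targets)}


# Example: one sentence sampled from a pre-training corpus.
example = build_distillation_example(
    "Barack Obama was born in Hawaii.",
    [("person", "Barack Obama"), ("location", "Hawaii")],
)
print(example["target"])  # → person: Barack Obama; location: Hawaii
```

A small seq2seq or tagging LM tuned on many such (input, target) pairs would then serve as the meta-model, to be further few-shot-tuned on a specific IE dataset.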


URL

https://arxiv.org/abs/2404.00457

PDF

https://arxiv.org/pdf/2404.00457.pdf

