Generating Query Focused Summaries without Fine-tuning the Transformer-based Pre-trained Models

2023-03-10 22:40:15
Deen Abdullah, Shamanth Nayak, Gandharv Suri, Yllias Chali

Abstract

Fine-tuning Natural Language Processing (NLP) models for each new dataset requires substantial computational time, with an associated increase in carbon footprint and cost. However, fine-tuning helps pre-trained models adapt to the latest datasets; what if we avoided the fine-tuning step and attempted to generate summaries using just the pre-trained models, thereby reducing computational time and cost? In this paper, we omit the fine-tuning step and investigate whether a Maximal Marginal Relevance (MMR)-based approach can help pre-trained models obtain query-focused summaries directly from a new dataset that was not used to pre-train the models. First, we used topic modelling on the Wikipedia Current Events Portal (WCEP) and Debatepedia datasets to generate queries for the summarization tasks. Then, using MMR, we ranked the sentences of the documents according to the queries. Next, we passed the ranked sentences to seven transformer-based pre-trained models to perform the summarization tasks. Finally, we applied MMR again to select the query-relevant sentences from the summaries generated by the individual pre-trained models and constructed the final summary. As the experimental results indicate, our MMR-based approach successfully ranked and selected the most relevant sentences as summaries and outperformed the individual pre-trained models.
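The pipeline applies MMR twice: once to rank document sentences against the query before summarization, and once to select query-relevant sentences from the models' outputs. Below is a minimal sketch of the MMR ranking step, assuming TF-IDF vectors and cosine similarity (the abstract does not specify the similarity measure); `mmr_rank`, `lam`, and `top_k` are illustrative names, not the authors' code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mmr_rank(query: str, sentences: list[str],
             lam: float = 0.7, top_k: int = 5) -> list[str]:
    """Rank sentences by Maximal Marginal Relevance against a query.

    At each step, MMR picks the unselected sentence maximizing
    lam * sim(sentence, query)
        - (1 - lam) * max sim(sentence, already-selected).
    """
    # Fit one TF-IDF space over the query and all candidate sentences.
    vec = TfidfVectorizer().fit([query] + sentences)
    q_vec = vec.transform([query])
    s_vec = vec.transform(sentences)
    relevance = cosine_similarity(s_vec, q_vec).ravel()  # relevance to the query
    redundancy = cosine_similarity(s_vec)                # sentence-to-sentence similarity

    selected: list[int] = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < top_k:
        if not selected:
            # First pick: pure query relevance.
            best = max(candidates, key=lambda i: relevance[i])
        else:
            # Later picks: trade relevance off against similarity to picks so far.
            best = max(
                candidates,
                key=lambda i: lam * relevance[i]
                - (1 - lam) * max(redundancy[i][j] for j in selected),
            )
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam = 1.0` the ranking reduces to pure query relevance; lowering `lam` penalizes redundancy among already-selected sentences, which is what makes the second MMR pass useful for assembling a final summary from several models' overlapping outputs.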

URL

https://arxiv.org/abs/2303.06230

PDF

https://arxiv.org/pdf/2303.06230.pdf

