Query-Utterance Attention with Joint modeling for Query-Focused Meeting Summarization

2023-03-08 10:21:45
Xingxian Liu, Bin Duan, Bo Xiao, Yajing Xu

Abstract

Query-focused meeting summarization (QFMS) aims to generate a summary of a meeting transcript in response to a given query. Previous work typically concatenates the query with the meeting transcript and models query relevance only implicitly, at the token level, through the attention mechanism. However, because long meeting transcripts dilute the key query-relevant information, the original transformer-based model is insufficient to highlight the parts related to the query. In this paper, we propose a query-aware framework that jointly models tokens and utterances based on Query-Utterance Attention. It calculates utterance-level relevance to the query with a dense retrieval module. Token-level and utterance-level query relevance are then combined and explicitly incorporated into the generation process through the attention mechanism. We show that query relevance at different granularities contributes to generating a summary more related to the query. Experimental results on the QMSum dataset show that the proposed model achieves new state-of-the-art performance.
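
The idea of combining relevance at two granularities can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the combination formula, so the multiplicative reweighting below, the dot-product scoring that stands in for the dense retrieval module, and every name (`joint_query_relevance`, `token_to_utt`, and so on) are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

def joint_query_relevance(query_vec, utt_vecs, token_scores, token_to_utt):
    """Combine utterance-level and token-level query relevance.

    ASSUMPTION: relevance is combined by multiplying each token's
    attention weight by the weight of the utterance it belongs to,
    then renormalizing. The paper may combine them differently.
    """
    # Utterance-level relevance: dot product between the query embedding
    # and each utterance embedding (a stand-in for dense retrieval).
    utt_rel = softmax(utt_vecs @ query_vec)
    # Token-level relevance: ordinary attention weights over tokens.
    tok_rel = softmax(token_scores)
    # Broadcast each utterance weight to its tokens and renormalize.
    joint = tok_rel * utt_rel[token_to_utt]
    return joint / joint.sum()

# Toy input: two utterances with two tokens each (hypothetical values).
query_vec = np.array([1.0, 0.0])
utt_vecs = np.array([[2.0, 0.0],    # utterance 0: aligned with the query
                     [0.0, 2.0]])   # utterance 1: orthogonal to the query
token_scores = np.zeros(4)          # token-level scores are uniform here
token_to_utt = np.array([0, 0, 1, 1])

weights = joint_query_relevance(query_vec, utt_vecs, token_scores, token_to_utt)
print(weights)
```

With uniform token-level scores, the tokens of the query-aligned utterance still receive most of the weight, which is the dilution problem the abstract says token-level attention alone does not address.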


URL

https://arxiv.org/abs/2303.04487

PDF

https://arxiv.org/pdf/2303.04487.pdf

