Paper Reading AI Learner

Which questions should I answer? Salience Prediction of Inquisitive Questions

2024-04-16 21:33:05
Yating Wu, Ritika Mangla, Alexandros G. Dimakis, Greg Durrett, Junyi Jessy Li

Abstract

Inquisitive questions -- open-ended, curiosity-driven questions people ask as they read -- are an integral part of discourse processing (Kehler and Rohde, 2017; Onea, 2016) and comprehension (Prince, 2004). Recent work in NLP has taken advantage of question generation capabilities of LLMs to enhance a wide range of applications. But the space of inquisitive questions is vast: many questions can be evoked from a given context. So which of those should be prioritized to find answers? Linguistic theories, unfortunately, have not yet provided an answer to this question. This paper presents QSALIENCE, a salience predictor of inquisitive questions. QSALIENCE is instruction-tuned over our dataset of linguist-annotated salience scores of 1,766 (context, question) pairs. A question scores high on salience if answering it would greatly enhance the understanding of the text (Van Rooy, 2003). We show that highly salient questions are empirically more likely to be answered in the same article, bridging potential questions (Onea, 2016) with Questions Under Discussion (Roberts, 2012). We further validate our findings by showing that answering salient questions is an indicator of summarization quality in news.
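The scoring setup the abstract describes can be illustrated with a minimal sketch (the prompt wording, function names, and score parsing below are illustrative assumptions, not the paper's actual QSALIENCE instruction-tuning format): format a (context, question) pair as an instruction asking for a salience rating, then parse the model's numeric reply.

```python
import re
from typing import Optional

def build_salience_prompt(context: str, question: str) -> str:
    """Format a (context, question) pair as an instruction for a
    salience predictor. The wording is a hypothetical stand-in for
    the paper's actual instruction-tuning prompt."""
    return (
        "Rate how salient the question is given the context, on a scale "
        "of 1 (not salient) to 5 (highly salient). A question is salient "
        "if answering it would greatly enhance understanding of the text.\n\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        "Salience score:"
    )

def parse_salience_score(reply: str) -> Optional[int]:
    """Extract the first digit 1-5 from a model reply; None if absent."""
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else None

# Example usage with a made-up context/question pair:
prompt = build_salience_prompt(
    "The city council approved the new budget after months of debate.",
    "What were the main points of contention during the debate?",
)
score = parse_salience_score("Salience score: 4")
```

In this sketch `prompt` would be sent to the instruction-tuned model and the returned text fed to `parse_salience_score`; the real system is fine-tuned on the 1,766 linguist-annotated pairs rather than prompted zero-shot.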


URL

https://arxiv.org/abs/2404.10917

PDF

https://arxiv.org/pdf/2404.10917.pdf

