Abstract
The process of knowledge acquisition can be viewed as a question-answer game between a student and a teacher in which the student typically starts by asking broad, open-ended questions before drilling down into specifics (Hintikka, 1981; Hakkarainen and Sintonen, 2002). This pedagogical perspective motivates a new way of representing documents. In this paper, we present SQUASH (Specificity-controlled Question-Answer Hierarchies), a novel and challenging text generation task that converts an input document into a hierarchy of question-answer pairs. Users can click on high-level questions (e.g., "Why did Frodo leave the Fellowship?") to reveal related but more specific questions (e.g., "Who did Frodo leave with?"). Using a question taxonomy loosely based on Lehnert (1978), we classify questions in existing reading comprehension datasets as either "general" or "specific". We then use these labels as input to a pipelined system centered around a conditional neural language model. We extensively evaluate the quality of the generated QA hierarchies through crowdsourced experiments and report strong empirical results.
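As a rough illustration of the output format described above, a hierarchy of question-answer pairs can be pictured as a small tree in which each "general" question holds the "specific" questions it reveals. The sketch below is a minimal assumption-laden example, not the authors' implementation: the names `QAPair`, `QANode`, and `render` are hypothetical, and the answer strings are placeholders because the abstract gives only the questions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QAPair:
    question: str
    answer: str
    specificity: str  # "general" or "specific", the two labels named in the abstract


@dataclass
class QANode:
    """One node of a hypothetical QA hierarchy: a QA pair plus more specific children."""
    qa: QAPair
    children: List["QANode"] = field(default_factory=list)


def render(nodes: List[QANode], depth: int = 0) -> List[str]:
    """Flatten the hierarchy into indented text lines, one level of indent per depth."""
    lines: List[str] = []
    for node in nodes:
        indent = "  " * depth
        lines.append(f"{indent}Q: {node.qa.question}")
        lines.append(f"{indent}A: {node.qa.answer}")
        lines.extend(render(node.children, depth + 1))
    return lines


# Example questions come from the abstract; the answers are placeholders.
general = QANode(QAPair("Why did Frodo leave the Fellowship?", "<answer span>", "general"))
general.children.append(
    QANode(QAPair("Who did Frodo leave with?", "<answer span>", "specific"))
)
hierarchy = [general]

if __name__ == "__main__":
    print("\n".join(render(hierarchy)))
```

In an interactive view, clicking a general question would correspond to expanding that node's `children`; how the actual system decides which specific questions attach to which general question is not stated in the abstract and is left out here.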
URL
https://arxiv.org/abs/1906.02622