IKDSumm: Incorporating Key-phrases into BERT for extractive Disaster Tweet Summarization

Abstract

Online social media platforms, such as Twitter, are among the most valuable sources of information during disaster events. Humanitarian organizations, government agencies, and volunteers therefore rely on summaries of this information, i.e., tweets, for effective disaster management. Several supervised and unsupervised approaches to automated tweet summarization exist, but they either require extensive labeled data or do not incorporate disaster-specific domain knowledge. The most recent approaches to disaster summarization use BERT-based models to enhance summary quality. To improve performance further, we use domain-specific knowledge, without any human effort, to estimate the importance (salience) of a tweet, which aids summary creation and improves summary quality. In this paper, we propose a disaster-specific tweet summarization framework, IKDSumm, which first identifies the crucial information in each disaster-related tweet through the key-phrases of that tweet. We identify these key-phrases using domain knowledge of disasters (from an existing ontology) without any human intervention, and then use them to automatically generate a summary of the tweets. Given tweets related to a disaster, IKDSumm thus fulfills the key objectives of summarization, such as information coverage, relevance, and diversity, without any human intervention. We compare IKDSumm with 8 state-of-the-art techniques on 12 disaster datasets. The results show that IKDSumm outperforms existing techniques by approximately 2-79% in terms of ROUGE-N F1-score.
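The reported metric, ROUGE-N F1, measures n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of that computation, not the evaluation script used in the paper. The function names, the lowercased whitespace tokenization, and the example sentences are assumptions made for illustration. Standard ROUGE toolkits typically add options such as stemming that are omitted here.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 between two strings.

    Assumption: tokens are lowercased and split on whitespace; no stemming
    or stop-word removal is applied.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example summaries, invented for illustration.
candidate = "roads blocked near the river"
reference = "roads near the river are blocked"
print(round(rouge_n_f1(candidate, reference, n=1), 3))  # 0.909
print(round(rouge_n_f1(candidate, reference, n=2), 3))  # 0.444
```

For unigrams, all five candidate tokens appear in the six-token reference, so precision is 1, recall is 5/6, and F1 is 10/11. For bigrams, only two of the four candidate bigrams match, which shows how higher values of N penalize differences in word order.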

URL

https://arxiv.org/abs/2305.11592

PDF

https://arxiv.org/pdf/2305.11592.pdf