Abstract
Identifying and exploring emerging trends in the news is more essential than ever, with many changes occurring worldwide due to the global health crisis. However, most recent research has focused on detecting trends in social media, where social features (e.g. likes and retweets on Twitter) help the task because they can be used to measure the engagement and diffusion rate of content. Formal text data, unlike short social media posts, comes in a longer, less restricted writing format and is thus more challenging. In this paper, we study emerging trend detection in financial news articles about Microsoft, collected before and during the start of the COVID-19 pandemic (July 2019 to July 2020). We make the dataset accessible and propose a strong baseline (Contextual Leap2Trend) that explores the dynamics of similarities between pairs of keywords based on topic modelling and term frequency. Finally, we evaluate against a gold standard (Google Trends) and present noteworthy real-world scenarios regarding the influence of the pandemic on Microsoft.
URL
https://arxiv.org/abs/2301.11318