Paper Reading AI Learner

Reap the Wild Wind: Detecting Media Storms in Large-Scale News Corpora

2024-04-14 16:47:38
Dror K. Markus, Effi Levi, Tamir Sheafer, Shaul R. Shenhav

Abstract

Media Storms, dramatic outbursts of attention to a story, are central components of media dynamics and the attention landscape. Despite their significance, there has been little systematic and empirical research on this concept due to issues of measurement and operationalization. We introduce an iterative human-in-the-loop method to identify media storms in a large-scale corpus of news articles. The text is first transformed into signals of dispersion based on several textual characteristics. In each iteration, we apply unsupervised anomaly detection to these signals; each anomaly is then validated by an expert to confirm the presence of a storm, and those results are then used to tune the anomaly detection in the next iteration. We demonstrate the applicability of this method in two scenarios: first, supplementing an initial list of media storms within a specific time frame; and second, detecting media storms in new time periods. We make available a media storm dataset compiled using both scenarios. Both the method and dataset offer the basis for comprehensive empirical research into the concept of media storms, including characterizing them and predicting their outbursts and durations, in mainstream media or social media platforms.
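The abstract describes an iterative, human-in-the-loop pipeline: daily news text is summarized into dispersion signals, an unsupervised anomaly detector flags candidate storm days, an expert validates each candidate, and the validation outcomes tune the detector for the next round. As a rough illustration of how such a loop could be wired together, here is a minimal Python sketch. The paper does not specify the detector, the exact dispersion signals, or the feedback rule; the IsolationForest model, the placeholder features in dispersion_signals, the review_candidate stub, and the contamination-adjustment heuristic below are assumptions for illustration only, not the authors' implementation.

    # Minimal sketch of an iterative human-in-the-loop storm-detection loop.
    # Detector choice, features, and feedback rule are illustrative assumptions.
    import numpy as np
    from sklearn.ensemble import IsolationForest


    def dispersion_signals(daily_articles):
        """Turn one day's articles into a feature vector of dispersion signals
        (e.g. article volume, average length). Placeholder statistics only."""
        n = len(daily_articles)
        avg_len = np.mean([len(a.split()) for a in daily_articles]) if n else 0.0
        return np.array([n, avg_len])


    def review_candidate(day, score):
        """Stand-in for expert validation: in the real workflow a human
        confirms or rejects each flagged day as a media storm."""
        print(f"Candidate storm on {day} (anomaly score {score:.3f}) -- needs review")
        return True  # placeholder: assume the expert confirms


    def detect_storms(corpus_by_day, n_iterations=3, contamination=0.05):
        days = sorted(corpus_by_day)
        X = np.vstack([dispersion_signals(corpus_by_day[d]) for d in days])
        confirmed = set()
        for _ in range(n_iterations):
            detector = IsolationForest(contamination=contamination, random_state=0)
            labels = detector.fit_predict(X)        # -1 marks an anomaly
            scores = -detector.score_samples(X)     # higher = more anomalous
            for day, label, score in zip(days, labels, scores):
                if label == -1 and day not in confirmed:
                    if review_candidate(day, score):
                        confirmed.add(day)
            # Feed validation results back: adjust the contamination level
            # according to how many flagged candidates the expert confirmed.
            hit_rate = len(confirmed) / max(1, (labels == -1).sum())
            contamination = min(0.5, max(0.01, contamination * (0.5 + hit_rate)))
        return confirmed


    if __name__ == "__main__":
        toy_corpus = {
            "2024-01-01": ["quiet news day"] * 5,
            "2024-01-02": ["routine coverage"] * 6,
            "2024-01-03": ["breaking story repeated everywhere"] * 60,  # storm-like spike
        }
        print(detect_storms(toy_corpus))

In this toy run, the third day's spike in volume stands out in the dispersion features and is flagged for expert review; the confirmed days would then serve as the validated storm list described in the abstract.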

Abstract (translated)

Media storms, that is, surges of attention to a story, are a core component of media dynamics and the attention landscape. Despite the significance of this concept, systematic and empirical research remains scarce due to problems of measurement and operationalization. We introduce an iterative human-in-the-loop method to identify media storms in a large-scale corpus of news articles. The text is first transformed into dispersion signals based on several textual characteristics. In each iteration, we apply unsupervised anomaly detection to these signals; each anomaly is then validated by an expert to confirm the presence of a storm, and those results are used to tune the anomaly detection in the next iteration. We demonstrate two application scenarios for this method: first, supplementing a list of media storms within a specific time frame; and second, detecting media storms in new time periods. We also release a media storm dataset compiled under both scenarios. The method and dataset provide the basis for comprehensive empirical research into the concept of media storms, including characterizing them and predicting their outbursts and durations, in mainstream media or on social media platforms.

URL

https://arxiv.org/abs/2404.09299

PDF

https://arxiv.org/pdf/2404.09299.pdf

