Abstract
We present a study of sentence-level factuality and bias in news articles across domains. While prior work in NLP has mainly focused on predicting the factuality of news reporting at the article level and the political-ideological bias of news media, we investigate the effects of framing bias in factual reporting across domains in order to predict factuality and bias at the sentence level, which may more accurately explain the overall reliability of the entire document. First, we manually produced a large sentence-level annotated dataset, FactNews, comprising 6,191 sentences from 100 news stories, each covered by three different outlets, for a total of 300 news articles. Next, we studied how biased and factual spans surface in news articles from different media outlets and different domains. Finally, we present a baseline model for factual sentence prediction, obtained by fine-tuning BERT. We also provide a detailed analysis of the data, demonstrating the reliability of the annotations and models.
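The baseline described above frames factual sentence prediction as binary sequence classification. A minimal sketch of such a setup, assuming a HuggingFace Transformers workflow, might look as follows; the label scheme, checkpoint, and example sentences are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of sentence-level factuality classification by
# fine-tuning BERT. Labels (0 = factual, 1 = biased/opinionated) are
# assumed for illustration; they are not the paper's exact scheme.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Illustrative sentences: one factual report, one opinionated framing.
sentences = [
    "The bill passed the Senate by a vote of 52 to 48 on Tuesday.",
    "This outrageous law is a shameful betrayal of every citizen.",
]
batch = tokenizer(sentences, padding=True, truncation=True,
                  return_tensors="pt")

model.eval()
with torch.no_grad():
    logits = model(**batch).logits  # shape: (num_sentences, num_labels)
preds = logits.argmax(dim=-1)
```

In practice the classification head would be trained on the annotated FactNews sentences (e.g. with `Trainer` or a standard fine-tuning loop) before the predictions are meaningful; an untuned head yields essentially random labels.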
URL
https://arxiv.org/abs/2301.11850