
Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

2024-05-02 08:28:25
Noah Giebink, Amrita Gupta, Diogo Verìssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

Abstract

Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned from queries are often cluttered with irrelevant content and syndicated articles. We aim to overcome these challenges by leveraging modern Natural Language Processing (NLP) tools. We introduce a folk taxonomy approach for improved search term generation and employ cosine similarity on Term Frequency-Inverse Document Frequency vectors to filter syndicated articles. We also introduce an extensible relevance filtering pipeline which uses unsupervised learning to reveal common topics, followed by an open-source zero-shot Large Language Model (LLM) to assign topics to news article titles, which are then used to assign relevance. Finally, we conduct sentiment, topic, and volume analyses on resulting data. We illustrate our methodology with a case study of news and X (formerly Twitter) data before and during the COVID-19 pandemic for various mammal taxa, including bats, pangolins, elephants, and gorillas. During the data collection period, up to 62% of articles including keywords pertaining to bats were deemed irrelevant to biodiversity, underscoring the importance of relevance filtering. At the pandemic's onset, we observed increased volume and a significant sentiment shift toward horseshoe bats, which were implicated in the pandemic, but not for other focal taxa. The proposed methods open the door to conservation practitioners applying modern and emerging NLP tools, including LLMs "out of the box," to analyze public perceptions of biodiversity during current events or campaigns.
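As a rough illustration of the syndicated-article filter mentioned in the abstract, the sketch below builds TF-IDF vectors and drops any article whose cosine similarity to an already-kept article reaches a threshold. It is not the authors' implementation: the function names, the simple tokenizer, the smoothed IDF formula, and the 0.8 threshold are all assumptions chosen for the example, and only the Python standard library is used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on runs of non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF (an assumption) keeps ubiquitous terms at non-zero weight.
        vectors.append({
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def drop_syndicated(docs, threshold=0.8):
    """Keep the first copy of each article and drop later near-duplicates.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    vectors = tfidf_vectors(docs)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [docs[i] for i in kept]


# Made-up example articles; the second is a lightly edited copy of the first.
articles = [
    "Horseshoe bats roost in caves across southern China, researchers report.",
    "Horseshoe bats roost in caves across southern China, researchers report today.",
    "Elephant populations in Kenya are recovering after a poaching crackdown.",
]
unique_articles = drop_syndicated(articles)
```

With these inputs, the second article is treated as a syndicated copy of the first and dropped, while the elephant article is kept. The pairwise comparison is quadratic in the number of articles, so a large corpus would likely need a vectorized or approximate nearest-neighbor approach instead.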


URL

https://arxiv.org/abs/2405.01610

PDF

https://arxiv.org/pdf/2405.01610.pdf

