Community Needs and Assets: A Computational Analysis of Community Conversations

2024-03-20 03:14:54
Md Towhidul Absar Chowdhury, Naveen Sharma, Ashiqur R. KhudaBukhsh


A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community, allowing them to allocate their resources better. Such assessments are increasingly drawing on social media conversations to analyze the needs of communities and the assets already present within them. However, manually analyzing the rapidly growing volume of social media conversations is challenging, and the present literature lacks computational analyses of how community members discuss the strengths and needs of their community. To address this gap, we introduce the task of identifying, extracting, and categorizing community needs and assets from conversational data using natural language processing methods. To facilitate this task, we introduce the first dataset about community needs and assets, consisting of 3,511 conversations from Reddit annotated by crowdsourced workers. Using this dataset, we evaluate an utterance-level classification model against two baselines: sentiment classification and a popular large language model (in a zero-shot setting). Our model outperforms both baselines, with an F1 score of 94% compared to 49% and 61%, respectively. Furthermore, we observe that conversations about needs carry negative sentiments and emotions, while conversations about assets focus on locations and entities. The dataset is available at this https URL.
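
The abstract compares models by F1 score. As a rough illustration of how such a score is computed for utterance-level labels, the following is a minimal sketch and not the authors' evaluation code. The label names ("need", "asset", "other") and the choice of macro averaging are assumptions made for this example; the abstract does not specify the label scheme or the averaging method.

```python
from typing import Sequence


def f1_for_label(gold: Sequence[str], predicted: Sequence[str], positive: str) -> float:
    """F1 for one label, treating `positive` as the positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive and p == positive)
    false_pos = sum(1 for g, p in pairs if g != positive and p == positive)
    false_neg = sum(1 for g, p in pairs if g == positive and p != positive)

    # With no true positives, precision or recall is zero (or undefined),
    # so F1 is reported as 0.0 by convention.
    if true_pos == 0:
        return 0.0

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Unweighted mean of per-label F1 over every label that appears."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        raise ValueError("no labels to score")
    return sum(f1_for_label(gold, predicted, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Hypothetical utterance labels, invented for illustration only.
    gold = ["need", "need", "asset", "other"]
    predicted = ["need", "asset", "asset", "other"]
    print(f"macro F1 = {macro_f1(gold, predicted):.3f}")
```

Macro averaging weights every label equally regardless of frequency; a different averaging choice (for example, scoring only the positive class) would give different numbers on the same predictions.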




