Stance Detection on Social Media with Fine-Tuned Large Language Models

2024-04-18 13:25:29
İlker Gül, Rémi Lebret, Karl Aberer

Abstract

Stance detection, a key task in natural language processing, determines an author's viewpoint based on textual analysis. This study evaluates the evolution of stance detection methods, transitioning from early machine learning approaches to the groundbreaking BERT model, and eventually to modern Large Language Models (LLMs) such as ChatGPT, LLaMa-2, and Mistral-7B. While ChatGPT's closed-source nature and associated costs present challenges, open-source models like LLaMa-2 and Mistral-7B offer an encouraging alternative. Initially, our research focused on fine-tuning ChatGPT, LLaMa-2, and Mistral-7B using several publicly available datasets. Subsequently, to provide a comprehensive comparison, we assess the performance of these models in zero-shot and few-shot learning scenarios. The results underscore the exceptional ability of LLMs in accurately detecting stance, with all tested models surpassing existing benchmarks. Notably, LLaMa-2 and Mistral-7B demonstrate remarkable efficiency and potential for stance detection, despite their smaller sizes compared to ChatGPT. This study emphasizes the potential of LLMs in stance detection and calls for more extensive research in this field.
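
To illustrate the difference between the zero-shot and few-shot scenarios mentioned in the abstract, here is a minimal sketch of how a stance prompt might be assembled. The label set, the prompt wording, and the function name `build_stance_prompt` are assumptions made for illustration; the abstract does not specify the prompts or labels the authors used.

```python
from typing import List, Optional, Tuple

# Assumed label set; the datasets used in the paper may define other labels.
LABELS = ("favor", "against", "neutral")


def build_stance_prompt(
    text: str,
    target: str,
    examples: Optional[List[Tuple[str, str, str]]] = None,
) -> str:
    """Build a stance-detection prompt (hypothetical format).

    With no examples the prompt is zero-shot; with labeled
    (text, target, label) examples prepended it is few-shot.
    """
    lines = [
        "Classify the stance of the text toward the target.",
        "Answer with one of: " + ", ".join(LABELS) + ".",
        "",
    ]
    # Few-shot demonstrations, if any, come before the query.
    for ex_text, ex_target, ex_label in examples or []:
        if ex_label not in LABELS:
            raise ValueError(f"unknown label: {ex_label}")
        lines += [
            f"Text: {ex_text}",
            f"Target: {ex_target}",
            f"Stance: {ex_label}",
            "",
        ]
    # The query ends with an open "Stance:" slot for the model to complete.
    lines += [f"Text: {text}", f"Target: {target}", "Stance:"]
    return "\n".join(lines)


if __name__ == "__main__":
    demos = [("Solar power saves money.", "solar energy", "favor")]
    print(build_stance_prompt("Wind farms ruin the view.", "wind energy", demos))
```

The resulting string would be sent to whichever model is being evaluated; fine-tuning, by contrast, updates the model's weights on labeled examples rather than placing them in the prompt.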


URL

https://arxiv.org/abs/2404.12171

PDF

https://arxiv.org/pdf/2404.12171.pdf

