
TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis


Abstract

Turkish is one of the most popular languages in the world. The wide use of the language on social media platforms such as Twitter, Instagram, and TikTok, together with the country's strategic position in world politics, makes it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. The model shares the same architecture as the base BERT model but with a smaller input length, making TurkishBERTweet lighter than BERTurk and giving it significantly lower inference time. We trained our model using the same approach as RoBERTa and evaluated it on two text classification tasks: Sentiment Classification and Hate Speech Detection. We demonstrate that TurkishBERTweet outperforms the other available alternatives in generalizability, and that its lower inference time gives it a significant advantage when processing large-scale datasets. We also compared our models with commercial OpenAI solutions in terms of cost and performance to demonstrate that TurkishBERTweet is a scalable and cost-effective solution. As part of our research, we released TurkishBERTweet and fine-tuned LoRA adapters for these tasks under the MIT License to facilitate future research and applications on Turkish social media. Our TurkishBERTweet model is available at: this https URL
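Since the base model and fine-tuned LoRA adapters are released publicly, a typical usage pattern would be to load the encoder and attach a task adapter. Below is a minimal sketch using the Hugging Face transformers and peft libraries; the repository IDs, the number of sentiment labels, and the example tweet are illustrative assumptions, not details taken from the paper, so consult the released model page for the actual identifiers.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "VRLLab/TurkishBERTweet"             # assumed Hugging Face model ID
adapter_id = "VRLLab/TurkishBERTweet-Lora-SA"  # assumed sentiment-classification LoRA adapter ID

# Load the tokenizer and the base encoder with a classification head
# (3 labels is an assumption: negative / neutral / positive).
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSequenceClassification.from_pretrained(base_id, num_labels=3)

# Attach the released LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Run inference on a sample Turkish tweet.
inputs = tokenizer("bugün hava çok güzel", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())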


URL

https://arxiv.org/abs/2311.18063

PDF

https://arxiv.org/pdf/2311.18063.pdf

