ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification

2023-03-08 09:35:09
Taja Kuzman, Igor Mozetič, Nikola Ljubešić

Abstract

ChatGPT has shown strong capabilities in natural language generation tasks, which naturally leads researchers to explore where its abilities end. In this paper, we examine whether ChatGPT can be used for zero-shot text classification, more specifically, automatic genre identification. We compare ChatGPT with a multilingual XLM-RoBERTa language model that was fine-tuned on datasets manually annotated with genres. The models are compared on test sets in two languages: English and Slovenian. Results show that ChatGPT outperforms the fine-tuned model when applied to a dataset that neither model had seen before. Even when applied to Slovenian, an under-resourced language, ChatGPT's performance is no worse than on English. However, if the model is prompted fully in Slovenian, the performance drops significantly, showing the current limitations of using ChatGPT on smaller languages. The presented results lead us to question whether this is the beginning of the end of laborious manual annotation campaigns, even for smaller languages such as Slovenian.
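
As a rough illustration of the zero-shot setup the abstract describes, the sketch below builds a classification prompt, maps a free-form reply back to a label, and scores predictions against manual annotations. Everything in it is an assumption made for illustration: the label list, the prompt wording, and the function names are hypothetical and are not taken from the paper, and no model is actually called.

```python
# Hypothetical label set for illustration only; the abstract does not list
# the genre categories used in the paper.
GENRES = ["News", "Opinion", "Instruction", "Legal", "Promotion", "Other"]


def build_prompt(text, labels=GENRES):
    """Compose a zero-shot classification prompt with English instructions."""
    options = ", ".join(labels)
    return (
        "Classify the following text into exactly one genre category.\n"
        f"Categories: {options}\n"
        f"Text: {text}\n"
        "Answer with the category name only."
    )


def parse_label(reply, labels=GENRES, fallback="Other"):
    """Return the first known label (in list order) found in the reply."""
    lowered = reply.lower()
    for label in labels:
        if label.lower() in lowered:
            return label
    # No known label mentioned: fall back to a catch-all category.
    return fallback


def accuracy(predicted, gold):
    """Share of texts whose predicted genre matches the manual annotation."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

The prompt keeps the instructions in English even when the text to classify is Slovenian, in line with the abstract's observation that prompting fully in Slovenian lowers performance. A reply string obtained from any chat model can be passed to `parse_label`, and the resulting labels compared with the gold annotations via `accuracy`.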

URL

https://arxiv.org/abs/2303.03953

PDF

https://arxiv.org/pdf/2303.03953.pdf

