ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification

Abstract

ChatGPT has shown strong capabilities in natural language generation tasks, which naturally leads researchers to explore where its abilities end. In this paper, we examine whether ChatGPT can be used for zero-shot text classification, specifically automatic genre identification. We compare ChatGPT with a multilingual XLM-RoBERTa language model that was fine-tuned on datasets manually annotated with genres. The models are compared on test sets in two languages: English and Slovenian. Results show that ChatGPT outperforms the fine-tuned model when applied to a dataset that neither model had seen before. Even when applied to Slovenian, an under-resourced language, ChatGPT performs no worse than it does on English. However, if the model is prompted entirely in Slovenian, performance drops significantly, showing the current limitations of using ChatGPT for smaller languages. The presented results lead us to question whether this is the beginning of the end of laborious manual annotation campaigns, even for smaller languages such as Slovenian.
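The zero-shot setup described in the abstract can be illustrated with a minimal sketch, under stated assumptions. The label set `GENRES`, the prompt wording, and the helper names `build_prompt`, `parse_label`, and `macro_f1` are all hypothetical; the abstract specifies neither the genre categories, the prompt, nor the evaluation metric. The call to the chat model itself is deliberately left out: the sketch only covers building the prompt, mapping a free-form reply back to a label, and scoring predictions against manual annotations.

```python
# Hypothetical label set for illustration only; the abstract does not list the genres.
GENRES = ["News", "Opinion", "Instruction", "Legal", "Other"]


def build_prompt(text, labels=GENRES):
    """Build an English zero-shot prompt that asks for exactly one genre label."""
    options = ", ".join(labels)
    return (
        "Classify the following text into exactly one of these genres: "
        f"{options}.\nAnswer with the genre name only.\n\nText: {text}\nGenre:"
    )


def parse_label(reply, labels=GENRES, fallback="Other"):
    """Map a free-form model reply to a known label (case-insensitive substring match)."""
    lowered = reply.strip().lower()
    for label in labels:
        if label.lower() in lowered:
            return label
    return fallback  # reply did not mention any known label


def macro_f1(gold, pred, labels=GENRES):
    """Unweighted mean of per-label F1, skipping labels absent from both gold and pred."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        if tp + fp + fn == 0:
            continue
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores) if scores else 0.0
```

In such a setup, the same `macro_f1` function could score both the prompted model's parsed replies and the fine-tuned classifier's predictions against the same manually annotated test set, which is one way to make the two approaches comparable.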

URL

https://arxiv.org/abs/2303.03953

PDF

https://arxiv.org/pdf/2303.03953.pdf