Abstract
Linguistic ambiguity remains a significant challenge for natural language processing (NLP) systems, despite advances in architectures such as Transformers and BERT. Inspired by the recent success of instructional models such as ChatGPT and Gemini (known as Bard in 2023), this study analyzes and discusses linguistic ambiguity within these models, focusing on three types prevalent in Brazilian Portuguese: semantic, syntactic, and lexical ambiguity. We created a corpus of 120 sentences, both ambiguous and unambiguous, for classification, explanation, and disambiguation. We also explored the models' capability to generate ambiguous sentences by soliciting sets of sentences for each type of ambiguity. The results underwent qualitative analysis, drawing on recognized linguistic references, and quantitative assessment based on the accuracy of the responses obtained. Even the most sophisticated models, such as ChatGPT and Gemini, exhibited errors and deficiencies in their responses, with explanations often proving inconsistent. Furthermore, accuracy peaked at 49.58%, indicating the need for descriptive studies for supervised learning.
URL
https://arxiv.org/abs/2404.16653