Abstract
Recent advances in natural language processing (NLP) may soon enable artificial intelligence (AI) models to generate writing that is indistinguishable from human writing, with profound ethical, legal, and social repercussions. This study addresses the problem by offering an accurate AI-detector model that differentiates between machine-generated and human-written text. Our approach includes machine learning methods, namely an XGBoost classifier and a support vector machine (SVM), as well as a deep learning model based on the BERT architecture. Our results show that BERT outperforms the other models in distinguishing AI-generated text from human-written text. We also provide a comprehensive review of pertinent studies on the current state of AI-generated text identification. Our experiments yielded positive findings, showing that our strategy is effective: the XGBoost classifier and the SVM achieve accuracies of 0.84 and 0.81, respectively, while the BERT model provides the highest accuracy in this study at 0.93. Finally, we analyze the societal implications of this research, highlighting potential benefits for various industries while addressing sustainability concerns related to ethics and the environment.
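The classical baselines described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the TF-IDF features, the toy corpus, and the choice of scikit-learn's LinearSVC as the SVM are all assumptions made for illustration; the paper's actual models, features, and dataset are not specified here.

```python
# Minimal sketch of an SVM baseline for human- vs. AI-text classification.
# Toy corpus and TF-IDF features are illustrative assumptions, not the
# paper's actual data or feature pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus: label 0 = human-written, 1 = AI-generated.
texts = [
    "I jotted these notes down quickly before the meeting started.",
    "Honestly, the movie was a mess but I kind of loved it anyway.",
    "As an AI language model, I can provide a structured overview of the topic.",
    "In conclusion, the aforementioned factors collectively contribute to the outcome.",
]
labels = [0, 0, 1, 1]

# Word and bigram TF-IDF features feeding a linear SVM classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)

# Predict labels for the training texts (illustration only; a real
# evaluation would use a held-out test set).
preds = clf.predict(texts)
```

A BERT-based detector would instead fine-tune a pretrained transformer with a binary classification head, which typically yields stronger results on this task, consistent with the accuracy gap reported above.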
URL
https://arxiv.org/abs/2404.10032