Towards Automatic Boundary Detection for Human-AI Hybrid Essay in Education

2023-07-23 08:47:51
Zijie Zeng, Lele Sha, Yuheng Li, Kaixun Yang, Dragan Gašević, Guanliang Chen

Abstract

Human-AI collaborative writing has been greatly facilitated by modern large language models (LLMs), e.g., ChatGPT. While acknowledging the convenience brought by this technological advancement, educators are also concerned that students might use LLMs to partially complete their writing assignments and pass off the human-AI hybrid text as their original work. Driven by such concerns, in this study we investigated the automatic detection of human-AI hybrid text in education, formalizing hybrid text detection as a boundary detection problem, i.e., identifying the transition points between human-written content and AI-generated content. We constructed a hybrid essay dataset by removing some of the sentences from original student-written essays and then instructing ChatGPT to fill in the incomplete essays. We then proposed a two-step detection approach in which we (1) separated AI-generated content from human-written content during the embedding learning process; and (2) calculated the distance between every two adjacent prototypes (a prototype is the mean, in the embedding space, of a set of consecutive sentences from the hybrid text) and assumed that the boundaries lie between the adjacent prototypes that are furthest from each other. Through extensive experiments, we summarized the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the embedding learning process (i.e., step 1) can significantly boost the performance of the proposed approach; (3) when detecting boundaries in single-boundary hybrid essays, the performance of the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement (against the second-best baseline method) in the in-domain setting and an 18% improvement in the out-of-domain setting.
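
Step (2) of the approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that sentence embeddings are already available (here, hand-made 2-D toy vectors stand in for the learned embeddings of step 1), that the distance is Euclidean (the abstract does not name the metric), that prototypes near the start or end of the essay are built from however many sentences are available, and that the number of boundaries is known in advance. The names `prototype`, `boundary_scores`, and `detect_boundaries` are hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence

Vector = Sequence[float]


def prototype(vectors: Sequence[Vector]) -> List[float]:
    """Return the mean of a set of consecutive sentence embeddings."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def boundary_scores(embeddings: Sequence[Vector], prototype_size: int) -> Dict[int, float]:
    """Score every candidate boundary i, i.e. the gap before sentence i.

    The score is the distance between the prototype of the (up to)
    `prototype_size` sentences before the gap and the prototype of the
    (up to) `prototype_size` sentences after it.
    Assumption: Euclidean distance; windows are truncated at the essay edges.
    """
    n = len(embeddings)
    scores: Dict[int, float] = {}
    for i in range(1, n):
        left = embeddings[max(0, i - prototype_size):i]
        right = embeddings[i:min(n, i + prototype_size)]
        scores[i] = dist(prototype(left), prototype(right))
    return scores


def detect_boundaries(
    embeddings: Sequence[Vector],
    prototype_size: int = 2,
    num_boundaries: int = 1,
) -> List[int]:
    """Pick the `num_boundaries` gaps whose adjacent prototypes are furthest apart."""
    scores = boundary_scores(embeddings, prototype_size)
    top = sorted(scores, key=scores.get, reverse=True)[:num_boundaries]
    return sorted(top)


if __name__ == "__main__":
    # Toy embeddings: three "human-like" sentences followed by three "AI-like" ones.
    toy = [
        [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
        [1.0, 1.1], [1.1, 1.0], [0.9, 1.0],
    ]
    print(detect_boundaries(toy, prototype_size=2))  # prints [3]
```

In this toy case the detected boundary index 3 means the transition falls between the third and fourth sentences. A larger `prototype_size` averages over more sentences on each side of a gap, which smooths out noise from individual sentences; this matches the abstract's observation that a relatively large prototype size helps for single-boundary essays, although the sketch itself does not reproduce any reported result.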

URL

https://arxiv.org/abs/2307.12267

PDF

https://arxiv.org/pdf/2307.12267.pdf

