Abstract
One-on-one tutoring is widely acknowledged as an effective instructional method, provided that the tutors are qualified. However, meeting the high demand for qualified tutors remains a challenge, which often necessitates training novice tutors (i.e., trainees) to ensure effective tutoring. Research suggests that timely explanatory feedback can facilitate this training. Providing such feedback is challenging, however, because assessment of trainee performance by human experts is time-consuming. Inspired by recent advances in large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. This system classifies trainees' responses as correct or incorrect and automatically provides template-based feedback in which the responses are appropriately rephrased by the GPT-4 model. We conducted our study on 410 responses from trainees across three training lessons: Giving Effective Praise, Reacting to Errors, and Determining What Students Know. Our findings indicate that: 1) using a few-shot approach, the GPT-4 model effectively distinguishes correct from incorrect trainee responses across the three training lessons, with an average F1 score of 0.84 and an AUC score of 0.85; and 2) using the few-shot approach, the GPT-4 model adeptly rephrases incorrect trainee responses into desired responses, achieving performance comparable to that of human experts.
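The abstract reports F1 and AUC for the binary correct/incorrect classification but does not say how they were computed. As a minimal sketch, assuming the standard binary definitions (F1 with the correct class as positive, AUC as the probability that a random positive example is ranked above a random negative one), the two metrics could be calculated as follows. The function names and the toy labels are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positives and negatives; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy labels (1 = correct response, 0 = incorrect), NOT data from the study.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

f1 = f1_score(y_true, y_pred)
auc = auc_score(y_true, y_pred)  # hard 0/1 predictions are used as scores here
print(f"F1 = {f1:.2f}, AUC = {auc:.2f}")
```

When the scores are hard 0/1 predictions, as in this toy example, the pairwise AUC reduces to the mean of the true positive rate and the true negative rate.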
URL
https://arxiv.org/abs/2405.00970