Abstract
Conversational tutoring systems (CTSs) offer learning experiences through natural-language interactions. They are recognized for promoting cognitive engagement and improving learning outcomes, especially in reasoning tasks. Nonetheless, the cost of authoring CTS content is a major obstacle to widespread adoption and to research on effective instructional design. In this paper, we discuss and evaluate a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways: First, the system enables AI-assisted content authoring by inducing an easily editable tutoring script automatically from a lesson text. Second, the system automates script orchestration in a learning-by-teaching format via two LLM-based agents (Ruffle&Riley) acting as a student and a professor. The system allows free-form conversations that follow the ITS-typical inner- and outer-loop structure. We evaluate Ruffle&Riley's ability to support biology lessons in two between-subject online user studies (N = 200) comparing the system to simpler QA chatbots and a reading activity. Analyzing system usage patterns, pre/post-test scores, and user experience surveys, we find that Ruffle&Riley users report high levels of engagement and understanding and perceive the offered support as helpful. Although Ruffle&Riley users require more time to complete the activity, we did not find significant differences in short-term learning gains relative to the reading activity. Our system architecture and user study provide various insights for designers of future CTSs. We further open-source our system to support ongoing research on effective instructional design of LLM-based learning technologies.
URL
https://arxiv.org/abs/2404.17460