
Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System

2024-04-26 14:57:55
Robin Schmucker, Meng Xia, Amos Azaria, Tom Mitchell

Abstract

Conversational tutoring systems (CTSs) offer learning experiences through interactions based on natural language. They are recognized for promoting cognitive engagement and improving learning outcomes, especially in reasoning tasks. Nonetheless, the cost associated with authoring CTS content is a major obstacle to widespread adoption and to research on effective instructional design. In this paper, we discuss and evaluate a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways: First, the system enables AI-assisted content authoring by inducing an easily editable tutoring script automatically from a lesson text. Second, the system automates the script orchestration in a learning-by-teaching format via two LLM-based agents (Ruffle&Riley) acting as a student and a professor. The system allows for free-form conversations that follow the ITS-typical inner and outer loop structure. We evaluate Ruffle&Riley's ability to support biology lessons in two between-subject online user studies (N = 200) comparing the system to simpler QA chatbots and a reading activity. Analyzing system usage patterns, pre/post-test scores and user experience surveys, we find that Ruffle&Riley users report high levels of engagement and understanding and perceive the offered support as helpful. Even though Ruffle&Riley users require more time to complete the activity, we did not find significant differences in short-term learning gains over the reading activity. Our system architecture and user study provide various insights for designers of future CTSs. We further open-source our system to support ongoing research on effective instructional design of LLM-based learning technologies.

URL

https://arxiv.org/abs/2404.17460

PDF

https://arxiv.org/pdf/2404.17460.pdf

