Abstract
Simulated patient systems play a crucial role in modern medical education and research, providing safe, integrative learning environments and enabling clinical decision-making simulations. Large Language Models (LLMs) could advance simulated patient systems by replicating medical conditions and patient-doctor interactions with high fidelity and low cost. However, ensuring the effectiveness and trustworthiness of these systems remains a challenge, as they require a large, diverse, and precise patient knowledge base, along with robust and stable knowledge diffusion to users. Here, we developed AIPatient, an advanced simulated patient system with the AIPatient Knowledge Graph (AIPatient KG) as the input and the Reasoning Retrieval-Augmented Generation (Reasoning RAG) agentic workflow as the generation backbone. AIPatient KG samples data from Electronic Health Records (EHRs) in the Medical Information Mart for Intensive Care (MIMIC)-III database, producing a clinically diverse and relevant cohort of 1,495 patients with high knowledge base validity (F1 0.89). Reasoning RAG leverages six LLM-powered agents spanning tasks including retrieval, KG query generation, abstraction, checking, rewriting, and summarization. This agentic framework reaches an overall accuracy of 94.15% in EHR-based medical Question Answering (QA), outperforming benchmarks that use either no agent or only partial agent integration. Our system also presents high readability (median Flesch Reading Ease 77.23; median Flesch-Kincaid Grade 5.6), robustness (ANOVA F-value 0.6126, p>0.1), and stability (ANOVA F-value 0.782, p>0.1). The promising performance of the AIPatient system highlights its potential to support a wide range of applications, including medical education, model evaluation, and system integration.
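For readers unfamiliar with the two readability metrics reported above, the following is a minimal sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is not the authors' evaluation code: the function names are hypothetical, and the word, sentence, and syllable counts are assumed to be supplied by the caller (syllable counting in practice is heuristic and tool-dependent).

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Illustrative counts only (not taken from the paper):
    # 100 words, 10 sentences, 130 syllables.
    print(round(flesch_reading_ease(100, 10, 130), 3))
    print(round(flesch_kincaid_grade(100, 10, 130), 3))
```

Under these formulas, shorter sentences and fewer syllables per word raise the Reading Ease score and lower the grade level, which is the direction reflected in the reported medians (Reading Ease 77.23, Grade 5.6).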
URL
https://arxiv.org/abs/2409.18924