Evaluating the Application of ChatGPT in Outpatient Triage Guidance: A Comparative Study

2024-04-27 04:12:02
Dou Liu, Ying Han, Xiandi Wang, Xiaomei Tan, Di Liu, Guangwu Qian, Kang Li, Dan Pu, Rong Yin

Abstract

The integration of Artificial Intelligence (AI) in healthcare has transformative potential for enhancing operational efficiency and health outcomes. Large Language Models (LLMs), such as ChatGPT, have shown their capabilities in supporting medical decision-making, and embedding LLMs in medical systems is becoming a promising trend in healthcare development. The potential of ChatGPT to address the triage problem in emergency departments has been examined, but few studies have explored its application in outpatient departments. With a focus on streamlining workflows and enhancing efficiency for outpatient triage, this study evaluates the consistency of responses provided by ChatGPT in outpatient guidance, including both within-version response analysis and between-version comparisons. Within versions, the results indicate that the internal response consistency of ChatGPT-4.0 is significantly higher than that of ChatGPT-3.5 (p=0.03), and both show moderate consistency in their top recommendation (71.2% for 4.0 and 59.6% for 3.5). However, the between-version consistency is relatively low (mean consistency score=1.43/3, median=1), indicating that few recommendations match between the two versions. Moreover, only 50% of top recommendations match perfectly in the comparisons. Interestingly, ChatGPT-3.5 responses are more likely to be complete than those from ChatGPT-4.0 (p=0.02), suggesting possible differences in information processing and response generation between the two versions. The findings offer insights into AI-assisted outpatient operations and facilitate the exploration of the potential and limitations of LLMs in healthcare utilization. Future research may focus on carefully optimizing LLMs and AI integration in healthcare systems based on ergonomic and human factors principles, precisely aligning with the specific needs of effective outpatient triage.
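
The abstract does not spell out how the consistency figures are computed, so the following is only an illustrative sketch under stated assumptions: each response is taken to be a ranked list of up to three recommended departments, the top-recommendation rate is taken as the share of response pairs whose first-ranked department agrees, and the score out of 3 is taken as the number of departments two lists share. The function names `top_match_rate` and `overlap_score` are hypothetical and are not from the paper.

```python
from itertools import combinations


def top_match_rate(runs):
    """Share of response pairs whose first-ranked department is identical.

    `runs` is a list of ranked department lists from repeated queries.
    This definition is an assumption, not the paper's stated method.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 0.0
    return sum(a[0] == b[0] for a, b in pairs) / len(pairs)


def overlap_score(recs_a, recs_b, k=3):
    """Number of departments shared by two top-k lists (0..k), order ignored.

    Assumed reading of a consistency score reported out of 3.
    """
    return len(set(recs_a[:k]) & set(recs_b[:k]))
```

For example, calling `overlap_score` on one ChatGPT-3.5 list and one ChatGPT-4.0 list for the same patient description would give a per-case score between 0 and 3, which could then be averaged across cases.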


URL

https://arxiv.org/abs/2405.00728

PDF

https://arxiv.org/pdf/2405.00728.pdf

