Data Set Terminology of Artificial Intelligence in Medicine: A Historical Review and Recommendation

2024-04-30 07:07:45
Shannon L. Walston, Hiroshi Seki, Hirotaka Takita, Yasuhito Mitsuyama, Shingo Sato, Akifumi Hagiwara, Rintaro Ito, Shouhei Hanaoka, Yukio Miki, Daiju Ueda

Abstract

Medicine and artificial intelligence (AI) engineering are two distinct fields, each with decades of published history. With such history comes terminology that each field applies in its own specific way. However, when two distinct fields with overlapping terminology start to collaborate, miscommunication and misunderstandings can occur. This narrative review aims to give historical context for these terms, emphasize the importance of clarity when they are used in medical AI contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. First, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. Then the data sets used for AI evaluation are classified, namely into random splitting, cross-validation, temporal, geographic, internal, and external sets. The accurate and standardized description of these data sets is crucial for demonstrating the robustness and generalizability of AI applications in medicine. This review clarifies existing literature to provide a comprehensive understanding of these classifications and their implications in AI evaluation. It then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion. Among these solutions are the use of standardized terminology such as 'training set,' 'validation (or tuning) set,' and 'test set,' and the explicit definition of data set splitting terminology in each medical AI research publication. This review aspires to enhance the precision of communication in medical AI, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
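
As an illustration of the terminology in the abstract, the following is a minimal sketch of two of the named splitting strategies: random splitting into a training set, a validation (or tuning) set, and a test set, and a temporal split that holds out later data. It is not taken from the paper. The record fields `patient_id` and `exam_date`, the 70/15/15 fractions, and the cutoff date are all assumptions chosen for the example.

```python
import random
from datetime import date


def random_split(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into training, validation (tuning), and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return {
        "training": shuffled[:n_train],                   # used to fit the model
        "validation": shuffled[n_train:n_train + n_val],  # used to tune it
        "test": shuffled[n_train + n_val:],               # used once, for final evaluation
    }


def temporal_split(records, cutoff):
    """Keep records before the cutoff for development; hold out later ones as a temporal test set."""
    development = [r for r in records if r["exam_date"] < cutoff]
    temporal_test = [r for r in records if r["exam_date"] >= cutoff]
    return development, temporal_test


# Hypothetical data: 20 exams, one per month starting January 2022.
records = [
    {"patient_id": i, "exam_date": date(2022 + i // 12, i % 12 + 1, 1)}
    for i in range(20)
]

splits = random_split(records)
development, temporal_test = temporal_split(records, cutoff=date(2023, 1, 1))
```

Naming the three subsets explicitly in code, as in the dictionary keys above, mirrors the abstract's recommendation to define the splitting terminology in each publication rather than rely on the reader's interpretation of 'validation'.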

URL

https://arxiv.org/abs/2404.19303

PDF

https://arxiv.org/pdf/2404.19303.pdf

