Paper Reading AI Learner

Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model

2024-04-24 20:42:28
Mojdeh Rahmanian, Seyed Mostafa Fakhrahmad, Seyedeh Zahra Mousavi

Abstract

Objective: Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants. Although leveraging electronic health records (EHR) for recruitment has gained popularity, the complexity of unstructured medical text makes it difficult to identify participants efficiently. Natural Language Processing (NLP) techniques have emerged as a solution, with a recent focus on transformer models. In this study, we aimed to evaluate the performance of a prompt-based large language model on the cohort selection task using unstructured medical notes collected in the EHR.

Methods: To process the medical records, we selected from each record the sentences most relevant to the trial's eligibility criteria. The SNOMED CT concepts related to each eligibility criterion were collected. Medical records were also annotated with MedCAT based on the SNOMED CT ontology. Annotated sentences containing concepts that matched the criterion-related terms were extracted. A prompt-based large language model (Generative Pre-trained Transformer (GPT) in this study) was then used with the extracted sentences as the training set. To assess its effectiveness, we evaluated the model's performance on the dataset from the 2018 n2c2 challenge, which aimed to classify the medical records of 311 patients against 13 eligibility criteria using NLP techniques.

Results: Our proposed model achieved overall micro and macro F-measures of 0.9061 and 0.8060, respectively, which are among the highest scores reported for experiments on this dataset.

Conclusion: The prompt-based large language model applied in this study to classify patients by eligibility criteria achieved promising scores. In addition, we proposed an extractive summarization method based on the SNOMED CT ontology that can also be applied to other medical texts.
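
The sentence extraction step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sentence has already been annotated with a set of concept IDs (the paper uses MedCAT for this), and the function name, criterion names, and concept IDs below are hypothetical placeholders rather than real SNOMED CT codes.

```python
from typing import Dict, List, Set, Tuple

# One annotated sentence: (text, concept IDs found in it). In the paper the
# annotations come from MedCAT; here they are supplied by hand (assumption).
AnnotatedSentence = Tuple[str, Set[str]]


def extract_relevant_sentences(
    record: List[AnnotatedSentence],
    criterion_concepts: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """For each criterion, keep sentences sharing at least one concept with it."""
    selected: Dict[str, List[str]] = {name: [] for name in criterion_concepts}
    for text, concepts in record:
        for name, wanted in criterion_concepts.items():
            # A non-empty intersection means the sentence mentions a concept
            # that was collected for this criterion.
            if concepts & wanted:
                selected[name].append(text)
    return selected


# Placeholder criterion names and concept IDs, not real SNOMED CT codes.
criterion_concepts = {
    "CRITERION_A": {"C001", "C002"},
    "CRITERION_B": {"C010"},
}
record = [
    ("Patient reports chest pain on exertion.", {"C001"}),
    ("Family lives nearby.", set()),
    ("Started insulin last year.", {"C010", "C011"}),
]
summary = extract_relevant_sentences(record, criterion_concepts)
```

Under these assumptions, `summary` maps each criterion to the sentences that mention at least one of its concepts, and the unannotated sentence is dropped. The resulting per-criterion sentence lists play the role of the extractive summary that is passed to the language model.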

URL

https://arxiv.org/abs/2404.16198

PDF

https://arxiv.org/pdf/2404.16198.pdf

