Abstract
Electronic health records (EHRs), though a boon for healthcare practitioners, are growing longer and more convoluted every day. Sifting through these lengthy EHRs is taxing and has become a cumbersome part of physician-patient interaction. Several approaches, via either summarization or sectioning, have been proposed to alleviate this prevalent issue; however, only a few have proven truly helpful. With the rise of automated methods, machine learning (ML) has shown promise for the task of identifying relevant sections in EHRs. However, most ML methods rely on labeled data, which is difficult to obtain in healthcare. Large language models (LLMs), on the other hand, have achieved impressive feats in natural language processing (NLP), often in a zero-shot manner, i.e., without any labeled data. To that end, we propose using LLMs to identify relevant section headers. We find that GPT-4 can effectively solve the task in both zero- and few-shot settings and segments dramatically better than state-of-the-art methods. Additionally, we annotate a much harder real-world dataset and find that GPT-4 struggles to perform well, pointing to the need for further research and harder benchmarks.
URL
https://arxiv.org/abs/2404.16294