Generating Faithful and Complete Hospital-Course Summaries from the Electronic Health Record

2024-04-01 15:47:21
Griffin Adams

Abstract

The rapid adoption of Electronic Health Records (EHRs) has been instrumental in streamlining administrative tasks, increasing transparency, and enabling continuity of care across providers. An unintended consequence of the increased documentation burden, however, has been reduced face time with patients and, concomitantly, a dramatic rise in clinician burnout. In this thesis, we pinpoint a particularly time-intensive, yet critical, documentation task, generating a summary of a patient's hospital admission, and we propose and evaluate automated solutions. In Chapter 2, we construct a dataset based on 109,000 hospitalizations (2M source notes) and perform exploratory analyses to motivate future work on modeling and evaluation [NAACL 2021]. In Chapter 3, we address faithfulness from a modeling perspective by revising noisy references [EMNLP 2022] and, to reduce the reliance on references, directly calibrating model outputs to metrics [ACL 2023]. These works relied heavily on automatic metrics because human annotations were limited. To fill this gap, in Chapter 4, we conduct a fine-grained expert annotation of system errors in order to meta-evaluate existing metrics and better understand task-specific issues of domain adaptation and source-summary alignments. To learn a metric less correlated with extractiveness (copy-and-paste), we derive noisy faithfulness labels from an ensemble of existing metrics and train a faithfulness classifier on these pseudo labels [MLHC 2023]. Finally, in Chapter 5, we demonstrate that fine-tuned LLMs (Mistral and Zephyr) are highly prone to entity hallucinations and cover fewer salient entities. We improve both coverage and faithfulness by performing sentence-level entity planning based on a set of pre-computed salient entities from the source text, which extends our work on entity-guided news summarization [ACL 2023], [EMNLP 2023].

URL

https://arxiv.org/abs/2404.01189

PDF

https://arxiv.org/pdf/2404.01189.pdf

