Paper Reading AI Learner

MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain

2024-05-03 14:48:20
Chao Jiang, Wei Xu

Abstract

Medical texts are notoriously challenging to read. Properly measuring their readability is the first step towards making them more accessible. In this paper, we present a systematic study on fine-grained readability measurements in the medical domain at both sentence-level and span-level. We introduce a new dataset MedReadMe, which consists of manually annotated readability ratings and fine-grained complex span annotation for 4,520 sentences, featuring two novel "Google-Easy" and "Google-Hard" categories. It supports our quantitative analysis, which covers 650 linguistic features and automatic complex word and jargon identification. Enabled by our high-quality annotation, we benchmark and improve several state-of-the-art sentence-level readability metrics for the medical domain specifically, which include unsupervised, supervised, and prompting-based methods using recently developed large language models (LLMs). Informed by our fine-grained complex span annotation, we find that adding a single feature, capturing the number of jargon spans, into existing readability formulas can significantly improve their correlation with human judgments. We will publicly release the dataset and code.
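The abstract's final finding lends itself to a quick illustration: take an existing readability formula and add a single extra feature, the number of jargon spans in the sentence. The sketch below is a hypothetical Python illustration of that idea, not the paper's implementation; `sentences`, `jargon_span_counts`, and `human_ratings` are assumed inputs (in the paper, the span counts come from the MedReadMe annotations or an automatic jargon identifier), and Flesch-Kincaid grade level stands in for "existing readability formulas".

```python
# A minimal sketch, assuming hypothetical inputs: augment a classic
# readability formula (FKGL) with one feature -- the jargon-span count
# per sentence -- and compare correlations with human ratings.
import textstat                      # classic readability formulas
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

def correlation_with_jargon_feature(sentences, jargon_span_counts, human_ratings):
    """Return Pearson r of (a) FKGL alone and (b) FKGL + jargon-span
    count, each against human readability ratings."""
    fkgl = np.array([textstat.flesch_kincaid_grade(s) for s in sentences])
    jargon = np.asarray(jargon_span_counts, dtype=float)
    y = np.asarray(human_ratings, dtype=float)

    # Baseline: the formula on its own.
    r_base, _ = pearsonr(fkgl, y)

    # Augmented: refit a linear model with the jargon count added.
    # (In practice this should be evaluated on held-out data.)
    X = np.column_stack([fkgl, jargon])
    pred = LinearRegression().fit(X, y).predict(X)
    r_aug, _ = pearsonr(pred, y)
    return r_base, r_aug
```

Whether the correlation gain is as large as the paper reports depends on the jargon identifier used; the sketch only shows where the one added feature slots into a formula-based metric.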


URL

https://arxiv.org/abs/2405.02144

PDF

https://arxiv.org/pdf/2405.02144.pdf
