Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served patient populations

2023-01-26 18:53:09
Mingquan Lin, Yuyun Xiao, Bojian Hou, Tingyi Wanyan, Mohit Manoj Sharma, Zhangyang Wang, Fei Wang, Sarah Van Tassel, Yifan Peng

Abstract

In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals. Deep learning has been widely used to detect POAG from fundus images because its performance is comparable to, or even surpasses, diagnosis by clinicians. However, human bias in clinical diagnosis may be reflected and amplified in widely used deep learning models, thus impacting their performance. Biases may cause (1) underdiagnosis, which increases the risk of delayed or inadequate treatment, and (2) overdiagnosis, which may increase individuals' stress and fear, harm their well-being, and lead to unnecessary or costly treatment. In this study, we examined underdiagnosis and overdiagnosis when applying deep learning to POAG detection, based on the Ocular Hypertension Treatment Study (OHTS), conducted at 22 centers across 16 states in the United States. Our results show that a widely used deep learning model can underdiagnose or overdiagnose under-served populations. The most underdiagnosed group is younger (< 60 yrs) females, and the most overdiagnosed group is older (>= 60 yrs) Black individuals. Biased diagnosis by traditional deep learning methods may delay disease detection and treatment and create burdens among under-served populations, thereby raising ethical concerns about using deep learning models in ophthalmology clinics.
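The abstract does not say how underdiagnosis and overdiagnosis are measured. A common convention in fairness audits, assumed here and not confirmed by the abstract, is to report the false negative rate (FNR) of each subgroup as its underdiagnosis rate and the false positive rate (FPR) as its overdiagnosis rate. The sketch below illustrates that convention; the function name `subgroup_error_rates`, the record layout, and the toy data are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def subgroup_error_rates(records):
    """Compute per-subgroup FNR and FPR.

    records: iterable of (group, y_true, y_pred), labels in {0, 1},
             where 1 means "POAG present" (assumed encoding).
    Returns {group: {"fnr": float or None, "fpr": float or None}};
    a rate is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "fp" if y_pred == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]  # truly diseased cases
        negatives = c["fp"] + c["tn"]  # truly healthy cases
        rates[group] = {
            # FNR: diseased cases the model missed (underdiagnosis, assumed proxy)
            "fnr": c["fn"] / positives if positives else None,
            # FPR: healthy cases the model flagged (overdiagnosis, assumed proxy)
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return rates


# Made-up toy records, for illustration only (not study data).
toy = [
    ("A", 1, 0), ("A", 1, 1), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 1), ("B", 0, 1), ("B", 0, 0),
]
rates = subgroup_error_rates(toy)
for group in sorted(rates):
    print(group, rates[group])
```

Under this convention, a subgroup whose FNR is well above the others would be read as underdiagnosed, and one whose FPR is well above the others as overdiagnosed. Intersectional subgroups such as sex by age band can be handled by using a tuple as the group key.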

URL

https://arxiv.org/abs/2301.11315

PDF

https://arxiv.org/pdf/2301.11315.pdf

