VR-FuseNet: A Fusion of Heterogeneous Fundus Data and Explainable Deep Network for Diabetic Retinopathy Classification

2025-04-30 09:38:47
Shamim Rahim Refat, Ziyan Shirin Raha, Shuvashis Sarker, Faika Fairuj Preotee, MD. Musfikur Rahman, Tashreef Muhammad, Mohammad Shafiul Islam

Abstract

Diabetic retinopathy is a severe eye condition caused by diabetes in which the retinal blood vessels are damaged; left untreated, it can lead to vision loss and blindness. It is a leading cause of blindness among diabetic patients, so early, accurate, and efficient automated detection is key to intervention and to stopping the disease from progressing. This paper presents a comprehensive approach to automated diabetic retinopathy detection built around a new hybrid deep learning model called VR-FuseNet. To address the limitations of existing methods, including dataset imbalance, diversity, and generalization issues, the paper constructs a hybrid dataset from five publicly available diabetic retinopathy datasets. Essential preprocessing techniques, namely SMOTE for class balancing and CLAHE for image enhancement, are applied systematically to the dataset to improve its robustness and generalizability. VR-FuseNet combines the strengths of two state-of-the-art convolutional neural networks: VGG19, which captures fine-grained spatial features, and ResNet50V2, which is known for its deep hierarchical feature extraction. The fused model achieves an accuracy of 91.824% and outperforms the individual architectures on all performance metrics, demonstrating the effectiveness of hybrid feature extraction in diabetic retinopathy classification tasks. To make the model more clinically useful and interpretable, the paper incorporates multiple XAI (explainable AI) techniques. These techniques generate visual explanations that clearly indicate the retinal features affecting the model's predictions, such as microaneurysms, hemorrhages, and exudates, so that clinicians can interpret and validate them.
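
The abstract names SMOTE for class balancing but does not give its configuration. As a rough illustration of the general idea only, the sketch below generates synthetic minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature vectors are assumptions made for this example, not details from the paper.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy feature vectors standing in for an under-represented severity grade.
minority_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority_features, n_new=6)
```

Because each synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region the minority class already occupies. How the paper applies this to fundus images (for example, on raw pixels or on extracted features) is not stated in the abstract.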

URL

https://arxiv.org/abs/2504.21464

PDF

https://arxiv.org/pdf/2504.21464.pdf

