Paper Reading AI Learner

SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

2024-04-20 14:06:04
Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

Abstract

Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this article, we established a label-free method, name 'SSVT',which can automatically analyze un-labeled fundus images and generate high evaluation accuracy of 97.0% of four main eye diseases based on six public datasets and two datasets collected by Beijing Tongren Hospital. The promising results showcased the effectiveness of the proposed unsupervised learning method, and the strong application potential in biomedical resource shortage regions to improve global eye health.

Abstract (translated)

基于机器学习的 fundus 图像诊断技术因其节省医疗资源、提供客观评估结果等好处而引起了全球关注。然而,现有的方法通常基于监督方法,给生物医学工作人员带来繁重的工作负担,导致难以扩展有效的数据库。为了应对这个问题,本文我们建立了一种名为 'SSVT' 的无标签方法,通过自动分析未标注的 fundus 图像并基于六个公开数据集和北京同仁医院的两个数据集,可以生成四种主要眼病97.0%的高评估精度。这种无标签学习方法展示出所提出的自监督学习方法的效力,以及在生物资源短缺地区改善全球眼健康具有强大的应用潜力。

URL

https://arxiv.org/abs/2404.13386

PDF

https://arxiv.org/pdf/2404.13386.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot