Abstract
Deep-learning-based image enhancement models have substantially improved the readability of fundus images, reducing the uncertainty of clinical observations and the risk of misdiagnosis. However, because paired real fundus images of different qualities are difficult to acquire, most existing methods must use synthetic image pairs as training data. The domain shift between synthetic and real images inevitably hinders the generalization of such models to clinical data. In this work, we propose an end-to-end optimized teacher-student framework that performs image enhancement and domain adaptation simultaneously. The student network is trained on synthetic pairs for supervised enhancement, and the enhancement model is regularized to reduce domain shift by enforcing teacher-student prediction consistency on real fundus images, without relying on enhanced ground truth. We also propose a novel multi-stage multi-attention guided enhancement network (MAGE-Net) as the backbone of both the teacher and student networks. MAGE-Net uses a multi-stage enhancement module and a retinal structure preservation module to progressively integrate multi-scale features while preserving retinal structures, yielding better fundus image quality enhancement. Comprehensive experiments on both real and synthetic datasets demonstrate that our framework outperforms the baseline approaches. Our method also benefits downstream clinical tasks.
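The training objective described above combines a supervised term on synthetic pairs with a consistency term on real images. The following is only a minimal sketch of that idea, not the authors' implementation: the L1 distance, the `consistency_weight` parameter, the flattened-list image representation, and the toy `student` and `teacher` functions are all assumptions made for illustration. How the teacher is optimized alongside the student is not specified in the abstract and is left out here.

```python
from typing import Callable, List, Sequence

Image = List[float]  # a flattened image; a stand-in used only for illustration


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def combined_loss(
    student: Callable[[Image], Image],
    teacher: Callable[[Image], Image],
    synthetic_low: Image,
    synthetic_high: Image,
    real_low: Image,
    consistency_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    # Supervised term: the synthetic pair provides a ground-truth target.
    supervised = l1(student(synthetic_low), synthetic_high)
    # Consistency term: on a real image there is no ground truth, so the
    # student's prediction is compared with the teacher's prediction instead.
    consistency = l1(student(real_low), teacher(real_low))
    return supervised + consistency_weight * consistency


# Toy "networks" (assumed): simple brightness scaling clipped to [0, 1].
student = lambda img: [min(1.0, 1.5 * p) for p in img]
teacher = lambda img: [min(1.0, 2.0 * p) for p in img]

loss = combined_loss(
    student,
    teacher,
    synthetic_low=[0.2, 0.4],
    synthetic_high=[0.4, 0.8],
    real_low=[0.1, 0.3],
    consistency_weight=0.5,
)
print(f"combined loss: {loss:.4f}")
```

The consistency term is the only place real images enter the objective, which is why no enhanced ground truth is needed for them.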
URL
https://arxiv.org/abs/2302.11795