Abstract
We explore Generalizable Tumor Segmentation, which aims to train a single model for zero-shot tumor segmentation across diverse anatomical regions. Existing methods are limited in segmentation quality, scalability, and the range of imaging modalities they support. In this paper, we show that the internal representations of frozen medical foundation diffusion models can serve as highly efficient zero-shot learners for tumor segmentation, and we introduce a novel framework named DiffuGTS. DiffuGTS creates anomaly-aware open-vocabulary attention maps from text prompts, enabling generalizable anomaly segmentation that is not restricted to a predefined list of training categories. To refine the anomaly segmentation masks, DiffuGTS uses the diffusion model to transform pathological regions into high-quality pseudo-healthy counterparts through latent space inpainting, and then applies a novel pixel-level and feature-level residual learning approach, yielding segmentation masks with significantly improved quality and generalization. Comprehensive experiments on four datasets and seven tumor categories demonstrate the superior performance of our method, which surpasses current state-of-the-art models across multiple zero-shot settings. Code is available at this https URL.
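The pixel-level residual idea mentioned above can be illustrated with a toy sketch: compare an input scan with its pseudo-healthy counterpart and mark the pixels that differ strongly. This is not the DiffuGTS implementation. The function name `residual_anomaly_mask`, the peak normalization, and the fixed `threshold` are all assumptions made for illustration. The abstract describes a learned pixel-level and feature-level residual approach, which a plain threshold does not reproduce.

```python
import numpy as np


def residual_anomaly_mask(image, pseudo_healthy, threshold=0.5):
    """Return a boolean mask of pixels whose normalized residual exceeds `threshold`.

    Hypothetical helper: a hand-written stand-in for a residual step, not the
    learned residual module described in the abstract.
    """
    image = np.asarray(image, dtype=np.float64)
    pseudo_healthy = np.asarray(pseudo_healthy, dtype=np.float64)
    if image.shape != pseudo_healthy.shape:
        raise ValueError("image and pseudo_healthy must have the same shape")

    # Pixel-level residual: absolute difference between scan and reconstruction.
    residual = np.abs(image - pseudo_healthy)

    # Normalize by the largest residual so the threshold is scale-independent.
    peak = residual.max()
    if peak == 0:
        # Identical inputs: nothing is flagged as anomalous.
        return np.zeros(image.shape, dtype=bool)
    return (residual / peak) > threshold


# Toy data: a 4x4 "scan" with a bright 2x2 lesion, and an all-zero
# pseudo-healthy version in which the lesion has been inpainted away.
scan = np.zeros((4, 4))
scan[1:3, 1:3] = 1.0
healthy = np.zeros((4, 4))
mask = residual_anomaly_mask(scan, healthy)
```

In this toy case `mask` is true only inside the 2x2 lesion block. A real pipeline would obtain the pseudo-healthy image from the diffusion model's latent space inpainting rather than from a hand-built array.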
URL
https://arxiv.org/abs/2505.02753