Abstract
Segmentation of lung-infected areas is crucial for assessing the severity of lung disease. However, existing image-text multi-modal methods typically rely on labour-intensive mask annotations for model training, which demand substantial time and expertise. To address this issue, we propose AKGNet, a novel attribute-knowledge-guided framework for unsupervised lung-infected-area segmentation that learns from image-text data alone, without any mask annotation. AKGNet jointly performs text-attribute knowledge learning, attribute-image cross-attention fusion, and high-confidence pseudo-label exploration. It learns statistical information and captures spatial correlations between images and text attributes in the embedding space, iteratively refining the mask to improve segmentation. Specifically, we introduce a text-attribute knowledge learning module that extracts attribute knowledge and incorporates it into feature representations, enabling the model to learn attribute statistics and adapt to different attributes. Moreover, we devise an attribute-image cross-attention module that computes correlations between attributes and image features in the embedding space to capture spatial dependencies, selectively attending to relevant regions while filtering out irrelevant ones. Finally, a self-training mask-refinement process generates pseudo-labels from high-confidence predictions to iteratively improve the masks and the segmentation. Experimental results on a benchmark medical image dataset demonstrate that our method outperforms state-of-the-art segmentation techniques in unsupervised scenarios.
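The two core mechanisms the abstract describes, attribute-image cross-attention and high-confidence pseudo-label generation, can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the tensor shapes, the scaled dot-product attention form, and the 0.9 confidence threshold are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attribute_image_cross_attention(attr_emb, img_feat):
    """attr_emb: (A, d) text-attribute embeddings; img_feat: (N, d) pixel features.
    Returns (A, d) attribute-conditioned features via scaled dot-product attention,
    so each attribute attends to the image regions it correlates with."""
    d = attr_emb.shape[-1]
    scores = attr_emb @ img_feat.T / np.sqrt(d)   # (A, N) attribute-pixel correlation
    weights = softmax(scores, axis=-1)            # focus on relevant regions
    return weights @ img_feat                     # (A, d) fused representation

def high_confidence_pseudo_labels(probs, thresh=0.9):
    """probs: (N,) predicted foreground probabilities. Only confident pixels
    become pseudo-labels; the rest (-1) are ignored in the next training round."""
    labels = np.full(probs.shape, -1, dtype=int)
    labels[probs >= thresh] = 1          # confident foreground (infected)
    labels[probs <= 1.0 - thresh] = 0    # confident background
    return labels

rng = np.random.default_rng(0)
attr = rng.normal(size=(4, 8))     # 4 hypothetical attributes, 8-dim embeddings
feats = rng.normal(size=(16, 8))   # 16 pixels, same embedding dimension
fused = attribute_image_cross_attention(attr, feats)
pl = high_confidence_pseudo_labels(np.array([0.95, 0.5, 0.02]))
print(fused.shape, pl.tolist())    # (4, 8) [1, -1, 0]
```

In the self-training loop described in the abstract, labels of -1 would simply be excluded from the loss, so each iteration trains only on pixels the current model is already confident about.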
URL
https://arxiv.org/abs/2404.11008