Abstract
While traditional self-supervised learning methods improve performance and robustness across various medical tasks, they rely on single-vector embeddings that may not capture fine-grained concepts such as anatomical structures or organs. The ability to identify such concepts and their characteristics without supervision has the potential to improve pre-training methods and enable novel applications such as fine-grained image retrieval and concept-based outlier detection. In this paper, we introduce ConceptVAE, a novel pre-training framework that detects fine-grained concepts and disentangles them from their style characteristics in a self-supervised manner. We present a suite of loss terms and model architecture primitives designed to discretise input data into a preset number of concepts along with their local style. We validate ConceptVAE both qualitatively and quantitatively. Qualitatively, it detects fine-grained anatomical structures such as blood pools and septum walls in 2D echocardiograms. Quantitatively, ConceptVAE outperforms traditional self-supervised methods in tasks such as region-based instance retrieval, semantic segmentation, out-of-distribution detection, and object detection. Additionally, we explore the generation of in-distribution synthetic data that maintains the same concepts as the training data but with distinct styles, highlighting the framework's potential for more calibrated data generation. Overall, our study introduces and validates a promising new pre-training technique based on concept-style disentanglement, opening multiple avenues for developing models for medical image analysis that are more interpretable and explainable than black-box approaches.
URL
https://arxiv.org/abs/2502.01335