AKGNet: Attribute Knowledge-Guided Unsupervised Lung-Infected Area Segmentation

2024-04-17 02:36:02
Qing En, Yuhong Guo

Abstract

Lung-infected area segmentation is crucial for assessing the severity of lung diseases. However, existing image-text multi-modal methods typically rely on labour-intensive annotations for model training, posing challenges regarding time and expertise. To address this issue, we propose a novel attribute knowledge-guided framework for unsupervised lung-infected area segmentation (AKGNet), which achieves segmentation solely based on image-text data without any mask annotation. AKGNet facilitates text attribute knowledge learning, attribute-image cross-attention fusion, and high-confidence-based pseudo-label exploration simultaneously. It can learn statistical information and capture spatial correlations between image and text attributes in the embedding space, iteratively refining the mask to enhance segmentation. Specifically, we introduce a text attribute knowledge learning module by extracting attribute knowledge and incorporating it into feature representations, enabling the model to learn statistical information and adapt to different attributes. Moreover, we devise an attribute-image cross-attention module by calculating the correlation between attributes and images in the embedding space to capture spatial dependency information, thus selectively focusing on relevant regions while filtering irrelevant areas. Finally, a self-training mask improvement process is employed by generating pseudo-labels using high-confidence predictions to iteratively enhance the mask and segmentation. Experimental results on a benchmark medical image dataset demonstrate the superior performance of our method compared to state-of-the-art segmentation techniques in unsupervised scenarios.
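The abstract describes two mechanisms that lend themselves to a concrete illustration: cross-attention between text-attribute embeddings and image features, and self-training on high-confidence pseudo-labels. The following is a minimal, hypothetical PyTorch sketch of what such components could look like; it is not the authors' implementation, and every name, tensor shape, and the 0.9 confidence threshold are assumptions made purely for illustration.

```python
# Illustrative sketch only (not AKGNet's released code): attribute-image
# cross-attention fusion and high-confidence pseudo-label generation.
import torch
import torch.nn as nn


class AttributeImageCrossAttention(nn.Module):
    """Fuse text-attribute embeddings with image features via cross-attention."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, image_feats: torch.Tensor, attr_embeds: torch.Tensor) -> torch.Tensor:
        # image_feats: (B, HW, C) flattened spatial features used as queries
        # attr_embeds: (B, K, C) text-attribute embeddings used as keys/values
        fused, _ = self.attn(query=image_feats, key=attr_embeds, value=attr_embeds)
        # Residual connection preserves the original image evidence.
        return image_feats + fused


def high_confidence_pseudo_labels(logits: torch.Tensor, threshold: float = 0.9):
    """Keep only confident predictions as pseudo-labels for self-training."""
    probs = torch.sigmoid(logits)                      # (B, 1, H, W) infection probability map
    pseudo = (probs > 0.5).float()                     # binarised prediction
    confident = (probs > threshold) | (probs < 1 - threshold)
    return pseudo, confident.float()                   # label map + confidence mask


if __name__ == "__main__":
    B, C, H, W, K = 2, 64, 16, 16, 5
    img = torch.randn(B, H * W, C)                     # flattened image features
    attrs = torch.randn(B, K, C)                       # K attribute embeddings
    fused = AttributeImageCrossAttention(dim=C)(img, attrs)
    labels, mask = high_confidence_pseudo_labels(torch.randn(B, 1, H, W))
    print(fused.shape, labels.shape, mask.mean().item())
```

In this sketch, the confidence mask would weight the self-training loss so that only pixels the model is already sure about contribute to the refined mask, mirroring the iterative mask-improvement idea described in the abstract.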

URL

https://arxiv.org/abs/2404.11008

PDF

https://arxiv.org/pdf/2404.11008.pdf

