
COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images

2024-04-19 12:09:49
Dmytro Shvetsov, Joonas Ariva, Marharyta Domnich, Raul Vicente, Dmytro Fishman

Abstract

Deep learning is dramatically transforming the field of medical imaging and radiology, enabling the identification of pathologies in medical images, including computed tomography (CT) and X-ray scans. However, the performance of deep learning models, particularly in segmentation tasks, is often limited by the need for extensive annotated datasets. To address this challenge, the capabilities of weakly supervised semantic segmentation are explored through the lens of Explainable AI and the generation of counterfactual explanations. The scope of this research is the development of a novel counterfactual inpainting approach (COIN) that uses a generative model to flip the predicted classification label from abnormal to normal. For instance, if the classifier deems an input medical image X abnormal, indicating the presence of a pathology, the generative model aims to inpaint the abnormal region, thus reversing the classifier's original prediction. The approach produces precise segmentations of pathologies without depending on pre-existing segmentation masks. Crucially, it uses only image-level labels, which are substantially easier to acquire than detailed segmentation masks. The effectiveness of the method is demonstrated by segmenting synthetic targets and actual kidney tumors in CT images acquired from Tartu University Hospital in Estonia. The findings indicate that COIN greatly surpasses established attribution methods, such as RISE, ScoreCAM, and LayerCAM, as well as an alternative counterfactual explanation method introduced by Singla et al. This evidence suggests that COIN is a promising approach for semantic segmentation of tumors in CT images, and represents a step forward in making deep learning applications more accessible and effective in healthcare, where annotated data is scarce.
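
The counterfactual idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_classifier`, `toy_inpaint`, and `difference_mask` are hypothetical stand-ins, and the step of thresholding the pixel-wise difference between the image and its counterfactual to obtain a mask is an assumption made for illustration, not something the abstract states.

```python
from statistics import median

# Hypothetical stand-ins for illustration only; none of these come from the COIN paper.

def toy_classifier(image, bright=0.8):
    """Return 'abnormal' if any pixel exceeds `bright`, else 'normal'."""
    return "abnormal" if any(p > bright for row in image for p in row) else "normal"

def toy_inpaint(image, bright=0.8):
    """Replace suspicious bright pixels with the median (background) intensity."""
    background = median(p for row in image for p in row)
    return [[background if p > bright else p for p in row] for row in image]

def difference_mask(image, counterfactual, tau=0.1):
    """Binary mask of pixels that the counterfactual changed by more than `tau`.

    Assumption: the segmentation is read off the image/counterfactual difference.
    """
    return [
        [1 if abs(a - b) > tau else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image, counterfactual)
    ]

# A 4x4 "scan" with a bright 2x2 synthetic target in the centre.
image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]

counterfactual = toy_inpaint(image)            # the "abnormal" region is painted over
mask = difference_mask(image, counterfactual)  # pixels that had to change

print(toy_classifier(image), "->", toy_classifier(counterfactual))
for row in mask:
    print(row)
```

In this toy setting the only supervision is the image-level decision of the classifier; the mask falls out of which pixels had to change for the label to flip, which is the intuition behind using counterfactuals for weakly supervised segmentation.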

URL

https://arxiv.org/abs/2404.12832

PDF

https://arxiv.org/pdf/2404.12832.pdf

