Paper Reading AI Learner

Concept Based Explanations and Class Contrasting

2025-02-05 18:10:02
Rudolf Herdt, Daniel Otero Baguer

Abstract

Explaining deep neural networks is challenging due to their large size and non-linearity. In this paper, we introduce a concept-based explanation method that explains the prediction for an individual class and can also contrast any two classes, i.e., explain why the model predicts one class over the other. We test it on several openly available classification models trained on ImageNet1K, as well as on a segmentation model trained to detect tumors in stained tissue samples. We perform both qualitative and quantitative tests. For example, for a ResNet50 model from the PyTorch model zoo, we can use the explanation of why the model predicts a class 'A' to automatically select six dataset crops for which the model does not predict class 'A'. The model then predicts class 'A' again for the newly combined image in 71% of the cases (it works for 710 out of the 1000 classes). The code, including an .ipynb example, is available on git: this https URL.
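The quantitative test described above can be pictured as a small recombination check: tile six crops that the model does not classify as 'A' into one image and see whether the combined image is classified as 'A'. The sketch below is only an illustration of that check, not the authors' code (which is in the linked repository); the crop file names, the 2x3 grid layout, the crop size, and the class index are all assumptions.

```python
# Minimal sketch of the crop-recombination check, assuming six pre-selected
# crop images on disk and a torchvision ResNet50 with ImageNet1K weights.
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.IMAGENET1K_V1
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

# Hypothetical paths to six crops that the model does NOT predict as class 'A'.
crop_paths = [f"crop_{i}.png" for i in range(6)]
crops = [Image.open(p).convert("RGB").resize((112, 112)) for p in crop_paths]

# Tile the six crops into a single 2 x 3 grid image (336 x 224 pixels).
combined = Image.new("RGB", (3 * 112, 2 * 112))
for i, crop in enumerate(crops):
    combined.paste(crop, ((i % 3) * 112, (i // 3) * 112))

class_a = 207  # hypothetical target class index 'A'
with torch.no_grad():
    logits = model(preprocess(combined).unsqueeze(0))
pred = logits.argmax(dim=1).item()
print(f"combined image predicted as class {pred}; matches 'A' ({class_a}): {pred == class_a}")
```

In the paper's experiment, the crops are not arbitrary: they are chosen automatically from the concept-based explanation of class 'A', which is what makes the 71% re-prediction rate a meaningful test of the explanation.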

Abstract (translated)

Explaining the predictions of deep neural networks is challenging, mainly because of their large size and non-linearity. In this paper, we propose a concept-based explanation method that explains why the model predicts a particular class and can contrast any two classes, i.e., explain why, for a given input, the model chooses one class over the other. We test the method on several openly available classification models trained on ImageNet1K, as well as on a segmentation model used to detect tumor regions in stained tissue samples. We perform both qualitative and quantitative tests. For example, for a ResNet50 model from the PyTorch model zoo, we can use the explanation of why the model predicts class 'A' to automatically select six dataset crops for which the model does not predict class 'A'. When these six crops are combined into one new image, the model again predicts class 'A' in 71% of the cases (this works for 710 of the 1000 classes). The code, including an .ipynb example, is available on git: this https URL.

URL

https://arxiv.org/abs/2502.03422

PDF

https://arxiv.org/pdf/2502.03422.pdf
