Paper Reading AI Learner

CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models

2024-04-23 08:32:38
Teodor Chiaburu, Frank Haußer, Felix Bießmann

Abstract

Mounting evidence in explainability for artificial intelligence (XAI) research suggests that good explanations should be tailored to individual tasks and should relate to concepts relevant to the task. However, building task-specific explanations is time-consuming and requires domain expertise which can be difficult to integrate into generic XAI methods. A promising approach towards designing useful task-specific explanations with domain experts is based on compositionality of semantic concepts. Here, we present a novel approach that enables domain experts to quickly create concept-based explanations for computer vision tasks intuitively via natural language. Leveraging recent progress in deep generative methods we propose to generate visual concept-based prototypes via text-to-image methods. These prototypes are then used to explain predictions of computer vision models via a simple k-Nearest-Neighbors routine. The modular design of CoProNN is simple to implement; it is straightforward to adapt to novel tasks and allows for replacing the classification and text-to-image models as more powerful models are released. The approach can be evaluated offline against the ground truth of predefined prototypes that can easily be communicated to domain experts, as they are based on visual concepts. We show that our strategy competes very well with other concept-based XAI approaches on coarse-grained image classification tasks and may even outperform those methods on more demanding fine-grained tasks. We demonstrate the effectiveness of our method for human-machine collaboration settings in qualitative and quantitative user studies. All code and experimental data can be found in our GitHub repository.
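The abstract describes two building blocks: visual concept prototypes generated from natural-language prompts with a text-to-image model, and a k-Nearest-Neighbors lookup in the feature space of the vision model that relates a prediction to those prototypes. The Python sketch below illustrates only the nearest-neighbor step under simplifying assumptions; it is not the authors' implementation (see their repository for that). In particular, the `prototypes/<concept>/` folder layout is hypothetical and assumes the prototype images were already generated with a text-to-image model, and a torchvision ResNet50 stands in for the actual classifier backbone.

```python
# Minimal CoProNN-style explanation sketch (illustrative, not the authors' code).
# Assumption: concept prototype images were pre-generated with a text-to-image
# model and stored as prototypes/<concept>/<image>.png.
from pathlib import Path

import torch
from PIL import Image
from torchvision import models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen backbone: drop the classification head, keep pooled features.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(image_path: Path) -> torch.Tensor:
    """Map an image to the backbone's pooled feature vector."""
    img = Image.open(image_path).convert("RGB")
    return backbone(preprocess(img).unsqueeze(0).to(device)).squeeze(0)

# Embed every generated prototype, remembering which concept it illustrates.
proto_dir = Path("prototypes")  # hypothetical layout: prototypes/<concept>/*.png
concepts, proto_feats = [], []
for concept_folder in sorted(proto_dir.iterdir()):
    for img_path in concept_folder.glob("*.png"):
        concepts.append(concept_folder.name)
        proto_feats.append(embed(img_path))
proto_feats = torch.nn.functional.normalize(torch.stack(proto_feats), dim=1)

def explain(query_path: Path, k: int = 5) -> dict:
    """Fraction of the k nearest prototypes that belong to each concept."""
    q = torch.nn.functional.normalize(embed(query_path), dim=0)
    sims = proto_feats @ q                      # cosine similarity to all prototypes
    top = sims.topk(min(k, sims.numel())).indices.tolist()
    votes = [concepts[i] for i in top]
    return {c: votes.count(c) / len(votes) for c in set(votes)}

print(explain(Path("query.jpg")))  # e.g. {"concept_a": 0.6, "concept_b": 0.4}
```

The returned concept fractions serve as the explanation: they indicate which text-defined concepts the query image resembles most in the model's feature space.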

Abstract (translated)

Mounting evidence in explainable AI (XAI) research suggests that good explanations should be tailored to the task at hand and should relate to concepts relevant to that task. However, building task-specific explanations is time-consuming and requires domain expertise that is difficult to integrate into generic XAI methods. A promising way to design useful task-specific explanations together with domain experts builds on the compositionality of semantic concepts. Here we present a novel approach that lets domain experts quickly and intuitively create concept-based explanations for computer vision tasks via natural language. Leveraging recent progress in deep generative models, we propose to generate visual concept prototypes with text-to-image methods. These prototypes are then used to explain the predictions of computer vision models through a simple k-Nearest-Neighbors routine. CoProNN's modular design is easy to implement, adapts readily to new tasks, and allows the classification and text-to-image models to be replaced as more powerful models become available. Our strategy competes very well with other concept-based XAI methods on coarse-grained image classification tasks and may even outperform them on more demanding fine-grained tasks. We demonstrate the effectiveness of our method in qualitative and quantitative user studies; all code and experimental data can be found in our GitHub repository.

URL

https://arxiv.org/abs/2404.14830

PDF

https://arxiv.org/pdf/2404.14830.pdf

