Abstract
Mounting evidence in explainability for artificial intelligence (XAI) research suggests that good explanations should be tailored to individual tasks and should relate to concepts relevant to the task. However, building task-specific explanations is time-consuming and requires domain expertise, which can be difficult to integrate into generic XAI methods. A promising approach to designing useful task-specific explanations with domain experts is based on the compositionality of semantic concepts. Here, we present CoProNN, a novel approach that enables domain experts to quickly and intuitively create concept-based explanations for computer vision tasks via natural language. Leveraging recent progress in deep generative methods, we propose to generate visual concept-based prototypes via text-to-image methods. These prototypes are then used to explain the predictions of computer vision models via a simple k-Nearest-Neighbors routine. The modular design of CoProNN is simple to implement: it is straightforward to adapt to novel tasks and allows the classification and text-to-image models to be replaced as more powerful models are released. The approach can be evaluated offline against the ground truth of predefined prototypes, which, being based on visual concepts, can also be communicated easily to domain experts. We show that our strategy competes very well with other concept-based XAI approaches on coarse-grained image classification tasks and may even outperform those methods on more demanding fine-grained tasks. We demonstrate the effectiveness of our method for human-machine collaboration settings in qualitative and quantitative user studies. All code and experimental data can be found in our GitHub repository (this https URL).
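The core explanation routine described above can be illustrated with a minimal sketch. All names here are hypothetical and the random clusters merely stand in for real data: in the actual pipeline, the prototype embeddings would come from text-to-image generated concept images passed through the vision model's feature extractor, not from synthetic vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: two visual concepts, each with five
# "prototype" embeddings drawn around a distinct cluster center.
concepts = ["stripes", "dots"]
prototypes = np.vstack([
    rng.normal(loc=3.0 * i, scale=0.1, size=(5, 8))
    for i in range(len(concepts))
])
labels = np.repeat(concepts, 5)

def knn_explain(embedding, k=3):
    """Return the concept whose prototypes dominate the k nearest neighbors."""
    dists = np.linalg.norm(prototypes - embedding, axis=1)
    nearest = labels[np.argsort(dists)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]

# A test embedding near the first cluster is explained by "stripes".
print(knn_explain(rng.normal(loc=0.0, scale=0.1, size=8)))
```

Because the routine only needs embeddings and labeled prototypes, swapping in a stronger feature extractor or a better text-to-image generator leaves this step unchanged, which is the modularity the abstract refers to.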
URL
https://arxiv.org/abs/2404.14830