Abstract
Attention mechanisms have greatly improved the performance of deep-learning models on visual, NLP, and multimodal tasks while also providing tools that aid model interpretability. In particular, attention scores over input regions or concrete image features can be used to measure how much the attended elements contribute to the model's inference. The recently proposed Concept Transformer (CT) generalizes the Transformer attention mechanism from such low-level input features to more abstract, intermediate-level latent concepts, allowing human analysts to more directly assess an explanation of the model's reasoning for any particular output classification. However, the concept learning employed by CT implicitly assumes that, across every image in a class, each image patch makes the same contribution to the concepts characterizing membership in that class. Replacing CT's image-patch-centric concepts with object-centric concepts could yield both better classification performance and better explainability. Thus, we propose Concept-Centric Transformers (CCT), a new family of concept transformers that provides more robust explanations and performance by integrating a novel concept-extraction module based on object-centric learning. We test our proposed CCT against CT and several other existing approaches on classification problems for MNIST (odd/even), CIFAR100 (super-classes), and CUB-200-2011 (bird species). Our experiments demonstrate that CCT not only achieves significantly better classification accuracy than all selected benchmark classifiers across all three of our test problems, but also generates more consistent concept-based explanations of classification output than CT.
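The core idea the abstract describes (cross-attention from input features to a small set of latent concept embeddings, with the attention scores doubling as a per-concept explanation of the prediction) can be sketched as follows. This is an illustrative NumPy sketch, not the paper's actual implementation: the function names, shapes, and the mean-pooling of scores are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def concept_attention(features, concepts, class_weights):
    """Cross-attention from input features (queries) to concept
    embeddings (keys). The pooled attention distribution over
    concepts serves both as the classifier input and as a
    human-readable explanation of the prediction.

    features:      (n_tokens, d)   patch or object-slot embeddings
    concepts:      (n_concepts, d) learnable concept embeddings
    class_weights: (n_concepts, n_classes) maps concepts to logits
    """
    d = features.shape[-1]
    # Scaled dot-product attention scores: (n_tokens, n_concepts).
    scores = softmax(features @ concepts.T / np.sqrt(d))
    # Pool over tokens to get one relevance weight per concept.
    explanation = scores.mean(axis=0)          # (n_concepts,)
    logits = explanation @ class_weights       # (n_classes,)
    return logits, explanation

rng = np.random.default_rng(0)
logits, expl = concept_attention(
    rng.normal(size=(16, 32)),   # 16 tokens, dim 32
    rng.normal(size=(8, 32)),    # 8 hypothetical concepts
    rng.normal(size=(8, 10)),    # 10 classes
)
```

Because `explanation` is a distribution over named concepts rather than over raw pixels or patches, an analyst can read off which concepts drove a classification; the paper's contribution is to derive those concepts with an object-centric extraction module rather than per-patch.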
URL
https://arxiv.org/abs/2305.15775