Abstract
Transparency and explainability in image classification are essential for establishing trust in machine learning models and for detecting biases and errors. State-of-the-art explainability methods generate saliency maps that show where a specific class is identified, without providing a detailed explanation of the model's decision process. To address this need, we introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network. These explanations include a layer-wise representation of the features the model extracts from the input. Such features are represented as saliency maps, generated by clustering and merging similar feature maps, to which we associate a weight derived by generalizing Grad-CAM to the proposed methodology. To further enhance these explanations, we include a set of textual labels collected through a gamified crowdsourcing activity and processed with NLP techniques and Sentence-BERT. Finally, we show an approach for generating global explanations by aggregating labels across multiple images.
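The clustering-and-weighting step described above can be sketched in a few lines of NumPy. Everything here is illustrative: the random arrays stand in for one layer's activations and class gradients, the cosine-similarity threshold and greedy grouping are placeholder choices (the paper's actual clustering procedure may differ), and only the Grad-CAM-style global-average-pooled gradient weighting follows the original Grad-CAM formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: activations A^k and gradients dY_c/dA^k of one
# conv layer for a single image (channels x height x width). In the real
# method these come from a forward/backward pass of the CNN.
acts = rng.random((8, 4, 4))
grads = rng.random((8, 4, 4))

def merge_similar_maps(acts, threshold=0.9):
    """Greedily group feature maps whose flattened activations have
    cosine similarity >= threshold, then average each group into one map."""
    flat = acts.reshape(acts.shape[0], -1)
    unit = flat / np.clip(np.linalg.norm(flat, axis=1, keepdims=True), 1e-12, None)
    sim = unit @ unit.T
    groups, assigned = [], set()
    for k in range(acts.shape[0]):
        if k in assigned:
            continue
        members = [j for j in range(acts.shape[0])
                   if j not in assigned and sim[k, j] >= threshold]
        assigned.update(members)
        groups.append(members)
    merged = np.stack([acts[g].mean(axis=0) for g in groups])
    return groups, merged

def gradcam_weights(grads, groups):
    """Grad-CAM weights each map by its global-average-pooled gradient;
    here each merged map gets the mean weight of its members (assumption)."""
    alpha = grads.mean(axis=(1, 2))
    return np.array([alpha[g].mean() for g in groups])

groups, merged = merge_similar_maps(acts)
weights = gradcam_weights(grads, groups)
# ReLU over the weighted sum, as in the original Grad-CAM formulation.
saliency = np.maximum((weights[:, None, None] * merged).sum(axis=0), 0)
```

Each entry of `merged` is one candidate layer-level saliency map, and `weights` gives its contribution to the class score; the final `saliency` combines them in the usual Grad-CAM fashion.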
URL
https://arxiv.org/abs/2405.03301