NeSyFOLD: A System for Generating Logic-based Explanations from Convolutional Neural Networks

2023-01-30 05:08:05
Parth Padalkar, Huaduo Wang, Gopal Gupta

Abstract

We present a novel neurosymbolic system called NeSyFOLD that classifies images while providing a logic-based explanation of the classification. NeSyFOLD's training process is as follows: (i) We first pre-train a CNN on the input image dataset and extract the activations of the last-layer filters as binary values; (ii) Next, we use the FOLD-SE-M rule-based machine learning algorithm to generate a logic program that can classify an image (represented as a vector of binary activations, one per filter) while producing a logical explanation. The rules generated by the FOLD-SE-M algorithm have filter numbers as predicates. We use a novel algorithm that we have devised for automatically mapping the CNN filters to semantic concepts in the images. This mapping is used to replace the predicate names (filter numbers) in the rule-set with the corresponding semantic concept labels. The resulting rule-set is highly interpretable and can be intuitively understood by humans. We compare our NeSyFOLD system with the ERIC system, which uses a decision-tree-like algorithm to obtain the rules. Our system has the following advantages over ERIC: (i) NeSyFOLD generates smaller rule-sets without compromising on accuracy and fidelity; (ii) NeSyFOLD generates the mapping of filter numbers to semantic labels automatically.
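The two data transformations named in the abstract (turning filter activations into a binary vector, then swapping filter-number predicates for semantic labels) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how activations are binarized, so the per-filter L2 norm and the threshold values are placeholders, and the helper names, the `f<number>` predicate format, the sample rule, and the labels are all hypothetical. FOLD-SE-M itself is not implemented.

```python
import math
import re


def filter_norms(feature_maps):
    """Return the L2 norm of each filter's 2D feature map (assumed summary statistic)."""
    return [
        math.sqrt(sum(v * v for row in fmap for v in row))
        for fmap in feature_maps
    ]


def binarize(norms, thresholds):
    """Map each filter norm to 1 if it exceeds that filter's threshold, else 0."""
    return [1 if n > t else 0 for n, t in zip(norms, thresholds)]


def relabel(rule, labels):
    """Replace predicates of the form f<number> with a semantic label, if one is known."""
    return re.sub(
        r"\bf(\d+)\b",
        lambda m: labels.get(int(m.group(1)), m.group(0)),
        rule,
    )


# Toy feature maps for three hypothetical last-layer filters.
feature_maps = [
    [[0.0, 0.1], [0.0, 0.2]],  # filter 0: weak response
    [[3.0, 4.0], [0.0, 0.0]],  # filter 1: norm 5.0
    [[1.0, 1.0], [1.0, 1.0]],  # filter 2: norm 2.0
]
thresholds = [1.0, 1.0, 2.5]  # placeholder values, not taken from the paper

bits = binarize(filter_norms(feature_maps), thresholds)

# A made-up rule in Prolog-like syntax, with filter numbers as predicates.
rule = "target(X, 'bathroom') :- f1(X), not f2(X)."
labels = {1: "toilet", 2: "bed"}  # hypothetical filter-to-concept mapping

print(bits)
print(relabel(rule, labels))
```

The binary vector `bits` stands in for the per-image input that a rule learner would consume, and `relabel` shows only the final substitution step; how the filter-to-concept mapping is computed is the paper's contribution and is not sketched here.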

URL

https://arxiv.org/abs/2301.12667

PDF

https://arxiv.org/pdf/2301.12667.pdf

