Paper Reading AI Learner

Flow AM: Generating Point Cloud Global Explanations by Latent Alignment

2024-04-29 14:57:16
Hanxiao Tan

Abstract

Although point cloud models have gained significant improvements in prediction accuracy over recent years, their trustworthiness has not yet been sufficiently investigated. In terms of global explainability, Activation Maximization (AM) techniques from the image domain are not directly transplantable due to the special structure of point cloud models. Existing studies exploit generative models to yield global explanations that humans can perceive. However, the opacity of the generative models themselves and the additional priors they introduce call into question the plausibility and fidelity of the explanations. In this work, we demonstrate that when the classifier predicts different types of instances, the intermediate layers are activated differently, a property known as activation flows. Based on this property, we propose an activation flow-based AM method that generates perceptible global explanations without incorporating any generative model. Furthermore, we reveal that AM methods based on generative models fail sanity checks and thus lack fidelity. Extensive experiments show that our approach dramatically enhances the perceptibility of explanations compared to other AM methods that are not based on generative models. Our code is available at: this https URL
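The abstract describes the standard Activation Maximization recipe: optimize a synthetic input so the classifier's logit for a target class is maximized, optionally pulling intermediate activations toward those observed for real instances of that class. Below is a minimal PyTorch sketch of that generic recipe for a point cloud input; the `TinyPointClassifier`, the `align_weight` hyperparameter, and the mean-squared alignment term are illustrative assumptions and do not reproduce the paper's exact objective or architecture.

```python
import torch
import torch.nn as nn

# Placeholder classifier for point clouds of shape (B, N, 3).
# (Hypothetical; the paper targets standard point cloud classifiers
# such as PointNet -- this tiny model only keeps the example runnable.)
class TinyPointClassifier(nn.Module):
    def __init__(self, num_classes=40):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        self.head = nn.Linear(128, num_classes)

    def forward(self, x):                 # x: (B, N, 3)
        feat = self.mlp(x)                # per-point features (B, N, 128)
        pooled = feat.max(dim=1).values   # permutation-invariant global feature
        return self.head(pooled), feat    # class logits and intermediate activations


def activation_maximization(model, target_class, num_points=1024, steps=300,
                            lr=0.01, align_weight=0.1, reference_activation=None):
    """Optimize a point cloud so the logit for `target_class` is maximized.

    If `reference_activation` is given, an extra term pulls the mean
    intermediate activation toward it (a generic stand-in for aligning
    latent activations; not the paper's exact objective).
    """
    points = torch.randn(1, num_points, 3, requires_grad=True)
    optimizer = torch.optim.Adam([points], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        logits, feat = model(points)
        loss = -logits[0, target_class]              # vanilla AM term
        if reference_activation is not None:
            loss = loss + align_weight * (
                feat.mean(dim=1) - reference_activation).pow(2).sum()
        loss.backward()
        optimizer.step()

    return points.detach()                           # the optimized point cloud


if __name__ == "__main__":
    model = TinyPointClassifier().eval()
    explanation = activation_maximization(model, target_class=0)
    print(explanation.shape)                         # torch.Size([1, 1024, 3])
```

In practice the reference activation would be computed from real samples of the target class; it is left as an optional argument here.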

Abstract (translated)

Although point cloud models have achieved significant improvements in prediction accuracy in recent years, their trustworthiness has not been sufficiently investigated. In terms of global explainability, Activation Maximization (AM) techniques from the image domain cannot be directly transplanted due to the special structure of point cloud models. Existing studies exploit generative models to produce global explanations that humans can perceive. However, the opacity of the generative models themselves and the additional priors they introduce cast doubt on the plausibility and fidelity of the explanations. In this work, we demonstrate that when the classifier predicts different types of instances, the intermediate layers are activated differently, a property referred to as activation flows. Based on this property, we propose an activation flow-based AM method that generates perceptible global explanations without incorporating any generative model. Furthermore, we find that AM methods based on generative models fail sanity checks and therefore lack fidelity. Extensive experiments show that, compared with other AM methods that are not based on generative models, our approach significantly enhances the perceptibility of explanations. Our code is available at: this https URL

URL

https://arxiv.org/abs/2404.18760

PDF

https://arxiv.org/pdf/2404.18760.pdf

