Abstract
Modern deep learning models often make predictions by focusing on irrelevant areas, leading to biased performance and limited generalization. Existing methods for rectifying model attention require explicit labels for irrelevant areas or complex pixel-wise ground-truth attention maps. We present CRAYON (Correcting Reasoning with Annotations of Yes Or No), an effective, scalable, and practical solution that rectifies model attention using simple yes-no annotations. CRAYON empowers both classical and modern model interpretation techniques to identify and guide model reasoning: CRAYON-ATTENTION guides classic saliency-map interpretations to focus on relevant image regions, while CRAYON-PRUNING removes irrelevant neurons identified by modern concept-based methods to mitigate their influence. Through extensive experiments with both quantitative and human evaluation, we showcase CRAYON's effectiveness, scalability, and practicality in refining model attention. CRAYON achieves state-of-the-art performance, outperforming 12 methods across 3 benchmark datasets and surpassing approaches that require more complex annotations.
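The CRAYON-PRUNING idea described above, removing neurons flagged as irrelevant so they no longer influence predictions, can be illustrated with a minimal sketch. Everything here (the function name, the channel-as-neuron view, and the indices of "irrelevant" channels) is a hypothetical toy, not the paper's actual implementation:

```python
import numpy as np

def prune_neurons(feature_maps, irrelevant_channels):
    """Zero out feature-map channels flagged as irrelevant.

    feature_maps: array of shape (channels, H, W), e.g. penultimate-layer
        activations of a CNN for one image.
    irrelevant_channels: hypothetical indices of channels that a
        concept-based interpretation method marked as encoding
        irrelevant concepts (e.g. background instead of the object).
    """
    pruned = feature_maps.copy()
    pruned[irrelevant_channels] = 0.0  # silence the flagged neurons
    return pruned

# Toy example: 4 channels of 2x2 activations
fmap = np.arange(16, dtype=float).reshape(4, 2, 2)
pruned = prune_neurons(fmap, irrelevant_channels=[1, 3])
```

In a real model, the pruned activations would then feed into the classifier head, so predictions are computed without any contribution from the silenced neurons; the yes-no annotations only need to say whether each concept is relevant, not where it appears in the image.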
URL
https://arxiv.org/abs/2410.22312