Paper Reading AI Learner

Identifying Adversarially Attackable and Robust Samples

2023-01-30 13:58:14
Vyas Raina, Mark Gales

Abstract

This work proposes a novel perspective on adversarial attacks by introducing the concepts of sample attackability and sample robustness. Adversarial attacks insert small, imperceptible perturbations into the input that cause large, undesired changes in the output of deep learning models. Despite extensive research on generating adversarial attacks and building defense systems, there has been limited research on understanding adversarial attacks from an input-data perspective. We propose a deep-learning-based method for detecting the most attackable and most robust samples in an unseen dataset for an unseen target model. The proposed method is based on a neural network architecture that takes a sample as input and outputs a measure of its attackability or robustness. The method is evaluated across a range of models and attack methods, and the results demonstrate its effectiveness in detecting the samples most likely to be affected by adversarial attacks. Understanding sample attackability has important implications for future work on sample-selection tasks. For example, in active learning the acquisition function can be designed to select the most attackable samples, while in adversarial training only the most attackable samples are selected for augmentation.
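The abstract does not specify the detector's architecture or training procedure, so the sketch below is only a hedged illustration of the general idea in PyTorch: label each training sample as "attackable" if a standard attack (FGSM is used here purely as an example) flips a target model's prediction within a fixed perturbation budget, then fit a small network that predicts that label directly from the sample. All class names, the attack choice, and the hyperparameters (`eps`, layer sizes) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): train a small "attackability
# detector" that maps an input sample to a score in [0, 1], using labels
# that record whether an FGSM attack within budget eps fooled a target model.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_attack(model, x, y, eps=8 / 255):
    """Single-step FGSM perturbation within an L-inf budget eps (illustrative attack choice)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()


def attackability_labels(model, x, y, eps=8 / 255):
    """Label a sample 1 ('attackable') if the attack flips the target model's prediction."""
    model.eval()
    x_adv = fgsm_attack(model, x, y, eps)
    with torch.no_grad():
        clean_pred = model(x).argmax(dim=1)
        adv_pred = model(x_adv).argmax(dim=1)
    return (clean_pred != adv_pred).float()


class AttackabilityDetector(nn.Module):
    """Small CNN that predicts how attackable an (unseen) input sample is."""

    def __init__(self, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.head(h)).squeeze(1)  # attackability score in [0, 1]


def train_step(detector, target_model, x, y, optimizer, eps=8 / 255):
    """One training step: fit the detector to attack-success labels on this batch."""
    labels = attackability_labels(target_model, x, y, eps)
    scores = detector(x)
    loss = F.binary_cross_entropy(scores, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this framing, the sample-selection uses mentioned in the abstract reduce to ranking unseen samples by the detector's score: an active-learning acquisition function or an adversarial-training pipeline would keep only the highest-scoring (most attackable) samples.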

Abstract (translated)

This work proposes a new perspective on adversarial attacks by introducing the concepts of sample attackability and robustness. Adversarial attacks insert small, imperceptible perturbations into the input that cause large, undesired changes in the output of deep learning models. Although extensive research has been devoted to generating adversarial attacks and building defense systems, understanding adversarial attacks from the input-data perspective has received little attention. We propose a deep-learning-based method for detecting the most attackable and most robust samples in an unseen dataset for an unseen target model. The method is based on a neural network architecture that takes a sample as input and outputs a measure of its attackability or robustness. The method is evaluated with a range of models and attack methods, and the results show that it effectively detects the samples most affected by adversarial attacks. Understanding sample attackability has important implications for future work on sample-selection tasks. For example, in active learning the acquisition function can be designed to select the most attackable samples, and in adversarial training only the most attackable samples are selected for augmentation.

URL

https://arxiv.org/abs/2301.12896

PDF

https://arxiv.org/pdf/2301.12896.pdf

