Paper Reading AI Learner

Fuzziness-tuned: Improving the Transferability of Adversarial Examples

2023-03-17 16:00:18
Xiangyuan Yang, Jie Lin, Hanlin Zhang, Xinyu Yang, Peng Zhao

Abstract

With the development of adversarial attacks, adversarial examples have been widely used to enhance the robustness of deep neural network models. Although considerable effort has been devoted to improving the transferability of adversarial examples, the attack success rate of transfer-based attacks on the surrogate model remains much higher than that on the victim model at low attack strength (e.g., attack strength $\epsilon=8/255$). In this paper, we first systematically investigate this issue and find that the large gap in attack success rates between the surrogate model and the victim model is caused by the existence of a special region (termed the fuzzy domain in this paper), in which adversarial examples are classified wrongly by the surrogate model but correctly by the victim model. Then, to eliminate this gap and improve the transferability of the generated adversarial examples, a fuzziness-tuned method consisting of a confidence scaling mechanism and a temperature scaling mechanism is proposed to ensure that the generated adversarial examples effectively escape the fuzzy domain. The confidence scaling mechanism and the temperature scaling mechanism collaboratively tune the fuzziness of the generated adversarial examples by adjusting the gradient descent weight of the fuzziness and stabilizing the update direction, respectively. Moreover, the proposed fuzziness-tuned method can be integrated with existing adversarial attacks to further improve the transferability of adversarial examples without changing the time complexity. Extensive experiments demonstrate that the fuzziness-tuned method effectively enhances the transferability of adversarial examples generated by the latest transfer-based attacks.
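To make the two mechanisms concrete, the PyTorch sketch below shows one plausible way to fold temperature scaling and a confidence-based term into an iterative FGSM-style attack on the surrogate model. This is only a minimal illustration derived from the abstract, not the authors' implementation; the hyperparameter names (temperature, kappa) and the exact form of the confidence term are assumptions.

# Minimal sketch (not the authors' code): temperature-scaled cross-entropy
# inside an I-FGSM-style loop, plus an assumed confidence term that pushes
# examples further past the surrogate's decision boundary (out of the
# "fuzzy domain"). All hyperparameter names are illustrative assumptions.
import torch
import torch.nn.functional as F

def fuzziness_tuned_ifgsm(model, x, y, eps=8/255, alpha=2/255, steps=10,
                          temperature=2.0, kappa=1.0):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # Temperature scaling: soften the logits to stabilize the update direction.
        ce_loss = F.cross_entropy(logits / temperature, y)
        # Confidence term (assumed form): subtracting the true-class confidence
        # from the maximized loss drives that confidence down, moving the
        # example away from the low-margin (fuzzy) region of the surrogate.
        true_conf = F.softmax(logits, dim=1).gather(1, y.unsqueeze(1)).squeeze(1)
        loss = ce_loss - kappa * true_conf.mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Standard I-FGSM step, projected back into the eps-ball and [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv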


URL

https://arxiv.org/abs/2303.10078

PDF

https://arxiv.org/pdf/2303.10078.pdf

