
Accuracy of TextFooler black box adversarial attacks on 01 loss sign activation neural network ensemble


Abstract

Recent work has shown that 01 loss sign activation neural networks can defend against adversarial attacks on image classification. A public challenge to attack these models on the CIFAR10 dataset remains undefeated. In this study we ask: are 01 loss sign activation neural networks hard to deceive with TextFooler, a popular black box text adversarial attack program? We study this question on four popular text classification datasets: IMDB reviews, Yelp reviews, MR sentiment classification, and AG news classification. We find that our 01 loss sign activation network is much harder to attack with TextFooler than sigmoid activation cross entropy networks and binary neural networks. We also study a 01 loss sign activation convolutional neural network with a novel global pooling step specific to sign activation networks. With this new variation we see a significant gain in adversarial accuracy, rendering TextFooler practically useless against it. We make our code freely available at \url{this https URL} and \url{this https URL}. Our work suggests that 01 loss sign activation networks could be further developed to create foolproof models against text adversarial attacks.
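The abstract relies on three terms without defining them: sign activation, 01 loss, and adversarial accuracy. A minimal sketch of each follows, assuming a single hidden layer of sign units, labels in {-1, +1}, and hand-picked weights. The function names, layer sizes, and toy data are hypothetical and chosen only for illustration; this is not the authors' architecture, ensemble, or training procedure.

```python
import numpy as np

def sign(z):
    # Sign activation with outputs in {-1, +1}.
    # Assumption: zero is mapped to +1 (a convention chosen for this sketch).
    return np.where(z >= 0, 1, -1)

def predict(X, W, b, v, c):
    # One hidden layer of sign units followed by a single sign output unit.
    hidden = sign(X @ W + b)
    return sign(hidden @ v + c)

def zero_one_loss(y_true, y_pred):
    # 01 loss: the fraction of examples that are misclassified.
    return float(np.mean(y_true != y_pred))

def adversarial_accuracy(y_true, y_pred_on_attacked):
    # Fraction of attacked inputs that the model still classifies correctly.
    return float(np.mean(y_true == y_pred_on_attacked))

# Toy data and hand-picked weights (hypothetical, for illustration only).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]])
y = np.array([1, 1, 1, -1])
W = np.array([[1.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0])
v = np.array([1.0, 0.0])
c = 0.0

clean_loss = zero_one_loss(y, predict(X, W, b, v, c))
```

Because the sign function is piecewise constant, its gradient is zero almost everywhere, so a network like this cannot be trained with ordinary backpropagation. The sketch therefore covers only the forward pass and the two metrics.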

URL

https://arxiv.org/abs/2402.07347

PDF

https://arxiv.org/pdf/2402.07347.pdf

