
Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning

2024-03-30 20:02:36
Xiaopeng Xie, Ming Yan, Xiwen Zhou, Chenlong Zhao, Suli Wang, Yong Zhang, Joey Tianyi Zhou

Abstract

The prompt-based learning paradigm has demonstrated remarkable efficacy in enhancing the adaptability of pretrained language models (PLMs), particularly in few-shot scenarios. However, this paradigm has been shown to be vulnerable to backdoor attacks. The current clean-label attack, which uses a specific prompt as the trigger, succeeds without external triggers and keeps poisoned samples correctly labeled, making it stealthier than the poisoned-label attack. On the other hand, it suffers from significant false activations and is harder to carry out, requiring a higher poisoning rate. Using conventional negative data augmentation methods, we found that it is difficult to balance effectiveness and stealthiness in a clean-label setting. To address this issue, we draw on the notion that a backdoor acts as a shortcut and posit that this shortcut stems from the contrast between the trigger and the data used for poisoning. In this study, we propose Contrastive Shortcut Injection (CSI), a method that leverages activation values to integrate trigger design and data selection strategies and thereby craft stronger shortcut features. With extensive experiments on full-shot and few-shot text classification tasks, we empirically validate CSI's high effectiveness and high stealthiness at low poisoning rates. Notably, we found that the two approaches play leading roles in full-shot and few-shot settings, respectively.

URL

https://arxiv.org/abs/2404.00461

PDF

https://arxiv.org/pdf/2404.00461.pdf

