
Influencer Backdoor Attack on Semantic Segmentation

2023-03-21 17:45:38
Haoheng Lan, Jindong Gu, Philip Torr, Hengshuang Zhao

Abstract

When a small number of poisoned samples are injected into the training dataset of a deep neural network, the network can be induced to exhibit malicious behavior during inference, which poses potential threats to real-world applications. While backdoor attacks have been intensively studied in classification, they have been largely overlooked in semantic segmentation. Unlike classification, semantic segmentation aims to classify every pixel within a given image. In this work, we explore backdoor attacks that cause segmentation models to misclassify all pixels of a victim class when a specific trigger is injected on non-victim pixels during inference; we dub this the Influencer Backdoor Attack (IBA). IBA is expected to maintain the classification accuracy of non-victim pixels and to mislead the classification of all victim pixels in every single inference. Specifically, we consider two IBA scenarios: 1) Free-position IBA, where the trigger can be positioned anywhere except on pixels of the victim class, and 2) Long-distance IBA, where the trigger can only be positioned far from the victim pixels, reflecting a possible practical constraint. Based on the context aggregation ability of segmentation models, we propose techniques to improve IBA in both scenarios. Concretely, for free-position IBA, we propose a simple yet effective Nearest Neighbor trigger injection strategy for poisoned sample creation. For long-distance IBA, we propose a novel Pixel Random Labeling strategy. Our extensive experiments reveal that current segmentation models do suffer from backdoor attacks, and verify that our proposed techniques can further increase attack performance.
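To make the poisoning setup in the abstract concrete, the following is a minimal sketch of how a single poisoned training sample might be built: a trigger patch is pasted onto non-victim pixels, and every victim-class pixel in the label map is relabeled. It is an illustration under stated assumptions, not the authors' implementation. The function name `poison_sample`, the `target_class` parameter, and the rectangular patch placement are all hypothetical; the abstract does not specify which class victim pixels are mapped to, nor how the Nearest Neighbor or Pixel Random Labeling strategies choose positions and labels.

```python
import numpy as np


def poison_sample(image, label, trigger, top_left, victim_class, target_class):
    """Build one poisoned (image, label) pair (hypothetical helper).

    image:   H x W x C array
    label:   H x W array of per-pixel class ids
    trigger: h x w x C patch pasted at top_left = (row, col)
    Assumption: victim pixels are relabeled to a single target_class.
    """
    h, w = trigger.shape[:2]
    r, c = top_left

    # The patch must lie fully inside the image.
    if r < 0 or c < 0 or r + h > label.shape[0] or c + w > label.shape[1]:
        raise ValueError("trigger does not fit inside the image")

    # The trigger may only cover non-victim pixels.
    if np.any(label[r:r + h, c:c + w] == victim_class):
        raise ValueError("trigger overlaps victim pixels")

    # Paste the trigger onto a copy so the clean sample stays intact.
    poisoned_image = image.copy()
    poisoned_image[r:r + h, c:c + w] = trigger

    # Relabel every victim pixel; non-victim labels are left unchanged.
    poisoned_label = label.copy()
    poisoned_label[label == victim_class] = target_class
    return poisoned_image, poisoned_label


# Toy usage: an 8x8 image whose top-left 2x2 block is the victim class (id 1).
image = np.zeros((8, 8, 3), dtype=np.uint8)
label = np.zeros((8, 8), dtype=np.int64)
label[0:2, 0:2] = 1
trigger = np.full((2, 2, 3), 255, dtype=np.uint8)

p_img, p_lbl = poison_sample(image, label, trigger, (5, 5),
                             victim_class=1, target_class=2)
```

In this sketch the choice of `top_left` is left to the caller. Under the two scenarios described above, it would be constrained either only by the no-overlap check (free-position) or additionally by a minimum distance from the victim pixels (long-distance), a check this sketch does not implement.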


URL

https://arxiv.org/abs/2303.12054

PDF

https://arxiv.org/pdf/2303.12054.pdf

