Paper Reading AI Learner

Attacking Important Pixels for Anchor-free Detectors

2023-01-26 23:03:03
Yunxu Xie, Shu Hu, Xin Wang, Quanyu Liao, Bin Zhu, Xi Wu, Siwei Lyu

Abstract

Deep neural networks have been demonstrated to be vulnerable to adversarial attacks: subtle perturbation can completely change the prediction result. Existing adversarial attacks on object detection focus on attacking anchor-based detectors, which may not work well for anchor-free detectors. In this paper, we propose the first adversarial attack dedicated to anchor-free detectors. It is a category-wise attack that attacks important pixels of all instances of a category simultaneously. Our attack manifests in two forms, sparse category-wise attack (SCA) and dense category-wise attack (DCA), that minimize the $L_0$ and $L_\infty$ norm-based perturbations, respectively. For DCA, we present three variants, DCA-G, DCA-L, and DCA-S, that select a global region, a local region, and a semantic region, respectively, to attack. Our experiments on large-scale benchmark datasets including PascalVOC, MS-COCO, and MS-COCO Keypoints indicate that our proposed methods achieve state-of-the-art attack performance and transferability on both object detection and human pose estimation tasks.

Abstract (translated)

深度学习网络已经被证明是受到对抗攻击的脆弱对象:微小的扰动可以完全改变预测结果。针对目标检测的对抗攻击主要攻击基于锚的检测器,对于无锚检测器可能不起作用。在本文中,我们提出了第一个专门攻击无锚检测器的对抗攻击。它是一种按类别攻击,攻击所有类别实例的重要像素同时。我们的攻击表现为两种形式,稀疏按类别攻击(SCA)和密集按类别攻击(DCA),分别最小化基于L_0和L_infty范数的扰动。对于DCA,我们介绍了三个变体,DCA-G、DCA-L和DCA-S,分别选择全球区域、本地区域和语义区域进行攻击。我们在包括PascalVOC、MS-COCO和MS-COCO关键点的大型基准数据集上的实验表明,我们提出的方法在目标检测和人体姿态估计任务中实现最先进的攻击性能和可移植性。

URL

https://arxiv.org/abs/2301.11457

PDF

https://arxiv.org/pdf/2301.11457.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot