Embodied Active Defense: Leveraging Recurrent Feedback to Counter Adversarial Patches

2024-03-31 03:02:35
Lingxuan Wu, Xiao Yang, Yinpeng Dong, Liuwei Xie, Hang Su, Jun Zhu

Abstract

The vulnerability of deep neural networks to adversarial patches has motivated numerous defense strategies for boosting model robustness. However, prevailing defenses rely on a single observation or on pre-established adversary information to counter adversarial patches; they often fail against unseen or adaptive adversarial attacks and exhibit unsatisfactory performance in dynamic 3D environments. Inspired by active human perception and recurrent feedback mechanisms, we develop Embodied Active Defense (EAD), a proactive defensive strategy that actively contextualizes environmental information to address misaligned adversarial patches in 3D real-world settings. To achieve this, EAD comprises two central recurrent sub-modules, i.e., a perception module and a policy module, which implement two critical functions of active vision. These modules recurrently process a series of beliefs and observations, progressively refining their comprehension of the target object and developing strategic actions to counter adversarial patches in 3D environments. To optimize learning efficiency, we incorporate a differentiable approximation of environmental dynamics and deploy patches that are agnostic to the adversary's strategies. Extensive experiments demonstrate that, through its action policy, EAD substantially enhances robustness against a variety of patches within just a few steps in safety-critical tasks (e.g., face recognition and object detection), without compromising standard accuracy. Furthermore, owing to its attack-agnostic design, EAD generalizes well to unseen attacks, reducing the average attack success rate by 95% across a range of unseen adversarial attacks.
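The abstract describes a recurrent perceive-act loop in which a perception module refines a belief about the target from successive observations while a policy module selects the next viewpoint-changing action. Below is a minimal, illustrative sketch of such a loop; it is not the authors' released implementation, and all names and interfaces (PerceptionModule, PolicyModule, the env.reset/env.step surrogate, tensor shapes) are assumptions made for exposition.

```python
# Illustrative sketch of a recurrent perception-policy loop in the spirit of EAD.
# All module names and the environment interface are assumptions, not the paper's code.
import torch
import torch.nn as nn

class PerceptionModule(nn.Module):
    """Recurrently fuses new observations into a belief about the target."""
    def __init__(self, obs_dim: int, belief_dim: int, num_classes: int):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, belief_dim)   # observation -> features
        self.rnn = nn.GRUCell(belief_dim, belief_dim)   # features + old belief -> new belief
        self.head = nn.Linear(belief_dim, num_classes)  # belief -> prediction

    def forward(self, obs: torch.Tensor, belief: torch.Tensor):
        feat = torch.relu(self.encoder(obs))
        belief = self.rnn(feat, belief)
        return belief, self.head(belief)

class PolicyModule(nn.Module):
    """Maps the current belief to the next viewpoint-changing action."""
    def __init__(self, belief_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(belief_dim, belief_dim), nn.Tanh(),
            nn.Linear(belief_dim, action_dim),
        )

    def forward(self, belief: torch.Tensor) -> torch.Tensor:
        return self.net(belief)

def active_defense_episode(env, perception, policy, steps: int = 4):
    """One episode: observe, update belief, act, repeat for a few steps."""
    obs = env.reset()  # initial (possibly patched) observation; env is a hypothetical surrogate
    belief = torch.zeros(obs.shape[0], perception.rnn.hidden_size)
    logits = None
    for _ in range(steps):
        belief, logits = perception(obs, belief)  # refine belief with the new view
        action = policy(belief)                   # choose the next viewpoint change
        obs = env.step(action)                    # assumed differentiable environment dynamics
    return logits                                  # prediction after active exploration
```

If the environment step is implemented as a differentiable approximation of the scene dynamics, as the abstract indicates, the whole loop can be trained end to end with standard gradient descent rather than reinforcement learning.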

URL

https://arxiv.org/abs/2404.00540

PDF

https://arxiv.org/pdf/2404.00540.pdf
