Detecting Adversarial Faces Using Only Real Face Self-Perturbations

2023-04-22 09:55:48
Qian Wang, Yongqin Xian, Hefei Ling, Jinyuan Zhang, Xiaorui Lin, Ping Li, Jiazhong Chen, Ning Yu

Abstract

Adversarial attacks aim to disturb the functionality of a target system by adding specific noise to the input samples, which poses potential threats to security and robustness when applied to facial recognition systems. Although existing defense techniques achieve high accuracy in detecting some specific adversarial faces (adv-faces), new attack methods, especially GAN-based attacks with completely different noise patterns, circumvent them and reach a higher attack success rate. Even worse, existing techniques require attack data before the defense can be implemented, which makes it impractical to defend against newly emerging attacks that are unseen by defenders. In this paper, we investigate the intrinsic generality of adv-faces and propose to generate pseudo adv-faces by perturbing real faces with three heuristically designed noise patterns. We are the first to train an adv-face detector using only real faces and their self-perturbations, agnostic to victim facial recognition systems and agnostic to unseen attacks. By regarding adv-faces as out-of-distribution data, we then naturally introduce a novel cascaded system for adv-face detection, which consists of training data self-perturbations, decision boundary regularization, and a max-pooling-based binary classifier focusing on abnormal local color aberrations. Experiments conducted on the LFW and CelebA-HQ datasets with eight gradient-based and two GAN-based attacks validate that our method generalizes to a variety of unseen adversarial attacks.
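
The abstract does not specify the three noise patterns or the classifier architecture, so the following is only a rough sketch of the general idea: build a training set from real faces and perturbed copies of them, then score an image by the maximum over local scores. The uniform patch noise, the names `self_perturb`, `build_training_set`, and `max_pool_score`, and the parameters `eps` and `patch` are all illustrative assumptions, not the authors' method.

```python
import random

def self_perturb(face, eps=8, patch=4, rng=None):
    """Return a copy of `face` with bounded random colour noise in one local patch.

    `face` is an H x W x 3 nested list of ints in [0, 255]. The uniform noise,
    `eps`, and `patch` are hypothetical choices, not the paper's noise patterns.
    """
    rng = rng or random.Random(0)
    h, w = len(face), len(face[0])
    top = rng.randrange(h - patch + 1)
    left = rng.randrange(w - patch + 1)
    out = [[list(px) for px in row] for row in face]  # deep copy, input untouched
    for y in range(top, top + patch):
        for x in range(left, left + patch):
            for c in range(3):
                noisy = out[y][x][c] + rng.randint(-eps, eps)
                out[y][x][c] = min(255, max(0, noisy))  # clamp to a valid pixel value
    return out

def build_training_set(real_faces, rng=None):
    """Label each real face 0 and its self-perturbed copy 1 (pseudo adv-face)."""
    rng = rng or random.Random(0)
    data = []
    for face in real_faces:
        data.append((face, 0))
        data.append((self_perturb(face, rng=rng), 1))
    return data

def max_pool_score(local_scores):
    """Image-level score is the largest local score in a 2D grid of scores."""
    return max(max(row) for row in local_scores)
```

Taking the maximum rather than the mean means a single abnormal local region can drive the image-level decision, which is one plausible reading of "focusing on abnormal local color aberrations". No attack data is needed to build the training set in this sketch, since the positive class comes entirely from perturbing real faces.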

URL

https://arxiv.org/abs/2304.11359

PDF

https://arxiv.org/pdf/2304.11359.pdf
