Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples

2019-04-16 16:46:16
Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen

Abstract

Image compression-based approaches for defending against adversarial-example attacks, which threaten the safe use of deep neural networks (DNNs), have been investigated recently. However, prior works mainly rely on directly tuning parameters such as the compression rate to blindly reduce image features, and therefore offer no guarantee of either defense efficiency (i.e., accuracy on polluted images) or classification accuracy on benign images after the defense is applied. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely "feature distillation", to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method. First, we maximize the filtering of malicious features in adversarial input perturbations by developing defensive quantization in the frequency domain of JPEG compression or decompression, guided by a semi-analytical method. Second, we suppress the distortion of benign features to restore classification accuracy through a DNN-oriented quantization refinement process. Our experimental results show that the proposed "feature distillation" can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects: defense efficiency (improving classification accuracy from $\sim20\%$ to $\sim90\%$ on adversarial examples), accuracy on benign images after defense ($\le1\%$ accuracy degradation), and processing time per image ($\sim259\times$ speedup). Moreover, our solution also provides the best defense efficiency ($\sim60\%$ accuracy) against the recent adaptive attack with the least accuracy reduction ($\sim1\%$) on benign images when compared with other input-transformation based defense methods.
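
The first step above, quantization in the JPEG frequency domain, can be illustrated with a toy example. The sketch below is not the authors' method: the step sizes, the cutoff, and the names toy_table, compress_roundtrip, and demo are all hypothetical, and the semi-analytical derivation and the DNN-oriented refinement are not reproduced. It only shows the general mechanism: a coarse quantization step on high-frequency DCT coefficients rounds a small high-frequency perturbation to zero, while finer steps on low-frequency coefficients leave the smooth image content intact.

```python
import math

N = 8  # JPEG transforms images in 8x8 blocks


def _scale(k):
    """Normalization factor that makes the DCT-II basis orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# C[k][n]: k-th DCT-II basis function sampled at position n.
C = [[_scale(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]
CT = [list(col) for col in zip(*C)]  # transpose of C


def matmul(a, b):
    """Plain 8x8 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (C is orthonormal)."""
    return matmul(matmul(CT, coeffs), C)


def toy_table(low_step, high_step, cutoff):
    """HYPOTHETICAL quantization table (not the paper's): low_step where
    u + v < cutoff (low frequencies), high_step everywhere else."""
    return [[low_step if u + v < cutoff else high_step for v in range(N)]
            for u in range(N)]


def compress_roundtrip(block, table):
    """Quantize each DCT coefficient to a multiple of its step, then invert."""
    coeffs = dct2(block)
    quantized = [[round(coeffs[u][v] / table[u][v]) * table[u][v]
                  for v in range(N)] for u in range(N)]
    return idct2(quantized)


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))


def demo():
    # Smooth "benign" block built from three low-frequency coefficients.
    benign_coeffs = [[0.0] * N for _ in range(N)]
    benign_coeffs[0][0], benign_coeffs[0][1], benign_coeffs[1][0] = 800.0, 40.0, -24.0
    benign = idct2(benign_coeffs)

    # Toy "perturbation": a single small coefficient at the highest frequency.
    noise_coeffs = [[0.0] * N for _ in range(N)]
    noise_coeffs[7][7] = 6.0
    noise = idct2(noise_coeffs)
    perturbed = [[benign[i][j] + noise[i][j] for j in range(N)] for i in range(N)]

    coarse = compress_roundtrip(perturbed, toy_table(8, 16, 4))  # coarse high-freq step
    fine = compress_roundtrip(perturbed, toy_table(2, 2, 4))     # uniformly fine step
    return max_abs_diff(coarse, benign), max_abs_diff(fine, benign)


if __name__ == "__main__":
    coarse_err, fine_err = demo()
    print("residual error, coarse high-frequency step:", coarse_err)
    print("residual error, uniformly fine step:", fine_err)
```

In this contrived setup the coarse table should return the benign block almost exactly, while the uniformly fine table lets the perturbation through. Real adversarial perturbations are not confined to one coefficient, which is presumably why the abstract describes choosing the quantization with a semi-analytical method and then refining it against the DNN rather than picking steps by hand.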

Abstract (translated)

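Image compression-based methods for defending deep neural networks (DNNs) against adversarial attacks have been an active research topic in recent years. However, previous work relies mainly on directly adjusting parameters such as the compression rate, blindly reducing image features, so that once the defense is applied it can guarantee neither defense efficiency (i.e., accuracy on polluted images) nor classification accuracy on benign images. To overcome these limitations, we propose a JPEG-based defensive compression framework, "feature distillation", that effectively rectifies adversarial examples without affecting classification accuracy on benign data. Our framework uses a two-step method to raise defense efficiency significantly at a marginal cost in accuracy. First, guided by a semi-analytical method, we develop defensive quantization in the frequency domain of JPEG compression or decompression to maximize the filtering of malicious features in adversarial input perturbations. Second, we suppress the distortion of benign features through a DNN-oriented quantization refinement process in order to restore classification accuracy. Experimental results show that the proposed "feature distillation" significantly outperforms the latest input-transformation based mitigations, such as Quilting and TV Minimization, in three respects: defense efficiency (classification accuracy on adversarial examples improves from $\sim20\%$ to $\sim90\%$), accuracy on benign images after defense ($\le1\%$ accuracy degradation), and processing time per image ($\sim259\times$ speedup). In addition, compared with other input-transformation based defenses, our solution provides the best defense efficiency ($\sim60\%$ accuracy) against the recent adaptive attack, with the smallest accuracy reduction ($\sim1\%$) on benign images.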

URL

https://arxiv.org/abs/1803.05787

PDF

https://arxiv.org/pdf/1803.05787.pdf

