Abstract
Deep neural networks (DNNs) have achieved tremendous success in many remote sensing (RS) applications, but their vulnerability to adversarial perturbations should not be neglected. Current adversarial defense approaches in RS studies usually suffer from performance fluctuation and unnecessary re-training costs because they require prior knowledge of the adversarial perturbations in RS data. To circumvent these challenges, we propose a universal adversarial defense approach for RS imagery (UAD-RS) that uses pre-trained diffusion models to defend common DNNs against multiple unknown adversarial attacks. Specifically, generative diffusion models are first pre-trained on different RS datasets to learn generalized representations in various data domains. A universal adversarial purification framework is then developed that uses the forward and reverse processes of the pre-trained diffusion models to remove the perturbations from adversarial samples. Furthermore, an adaptive noise level selection (ANLS) mechanism is built to find the optimal noise level of the diffusion model, that is, the level whose purification results are closest to the clean samples as measured by the Fréchet Inception Distance (FID) in deep feature space. As a result, only a single pre-trained diffusion model is needed for the universal purification of adversarial samples on each dataset, which significantly reduces the re-training effort for each attack setting and maintains high performance without prior knowledge of the adversarial perturbations. Experiments on four heterogeneous RS datasets covering scene classification and semantic segmentation verify that UAD-RS outperforms state-of-the-art adversarial purification approaches and provides a universal defense against seven common adversarial perturbations.
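The purification and noise level selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names, the linear beta schedule, and the candidate steps are hypothetical, `reverse_fn` stands in for the reverse process of a pre-trained diffusion model, and `feature_fn` stands in for a deep feature extractor. To stay within the standard library, the distance is a Fréchet distance between diagonal Gaussians fit to the features, which is a simplification of the full-covariance FID named in the abstract.

```python
import math
import random
from statistics import fmean, pstdev


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def forward_diffuse(x, t, alpha_bars, rng):
    """Sample x_t = sqrt(abar_t) * x + sqrt(1 - abar_t) * eps, with eps ~ N(0, I)."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x]


def purify(x_adv, t, alpha_bars, reverse_fn, rng):
    """Diffuse an adversarial sample to step t, then map it back with reverse_fn.

    reverse_fn(x_t, t) is a hypothetical stand-in for the reverse process of a
    pre-trained diffusion model.
    """
    return reverse_fn(forward_diffuse(x_adv, t, alpha_bars, rng), t)


def frechet_distance_diag(feats_a, feats_b):
    """Squared Frechet distance between diagonal Gaussians fit to two feature sets.

    Simplification: the full FID uses complete covariance matrices.
    """
    dist = 0.0
    for col_a, col_b in zip(zip(*feats_a), zip(*feats_b)):
        dist += (fmean(col_a) - fmean(col_b)) ** 2
        dist += (pstdev(col_a) - pstdev(col_b)) ** 2
    return dist


def select_noise_level(adv_batch, clean_batch, candidates, alpha_bars,
                       reverse_fn, feature_fn, seed=0):
    """Return (best_step, scores): the candidate step whose purified batch has
    the smallest feature distance to clean_batch, plus all candidate scores."""
    clean_feats = [feature_fn(x) for x in clean_batch]
    scores = {}
    for t in candidates:
        rng = random.Random(seed)  # reseed so each candidate sees the same noise draws
        purified = [purify(x, t, alpha_bars, reverse_fn, rng) for x in adv_batch]
        purified_feats = [feature_fn(x) for x in purified]
        scores[t] = frechet_distance_diag(purified_feats, clean_feats)
    return min(scores, key=scores.get), scores
```

Here images are flat lists of floats purely to keep the sketch dependency-free; a practical version would presumably operate on tensors, wrap the reverse sampler of the pre-trained model in `reverse_fn`, and use a deep network for `feature_fn`. The trade-off the selection loop explores is that a small step leaves more of the perturbation in place, while a large step destroys more of the image content.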
URL
https://arxiv.org/abs/2307.16865