Mimic and Fool: A Task Agnostic Adversarial Attack

2019-06-11 13:56:12
Akshay Chaturvedi, Utpal Garain

Abstract

At present, adversarial attacks are designed in a task-specific fashion. However, for downstream computer vision tasks such as image captioning and image segmentation, current deep learning systems use an image classifier such as VGG16, ResNet50, or Inception-v3 as a feature extractor. Keeping this in mind, we propose Mimic and Fool, a task-agnostic adversarial attack. Given a feature extractor, the proposed attack finds an adversarial image that mimics the image feature of the original image. This ensures that the two images give the same (or similar) output regardless of the task. We randomly select 1000 MSCOCO validation images for experimentation. We perform experiments on two image captioning models, Show and Tell and Show, Attend and Tell, and one VQA model, namely the end-to-end neural module network (N2NMN). The proposed attack achieves success rates of 74.0%, 81.0%, and 89.6% for Show and Tell, Show, Attend and Tell, and N2NMN, respectively. We also propose a slight modification to our attack to generate natural-looking adversarial images. In addition, we show that the proposed attack also works for an invertible architecture. Since Mimic and Fool only requires information about the feature extractor of the model, it can be considered a gray-box attack.
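
The core idea stated in the abstract, finding an image whose extracted features match those of the original, can be sketched as a feature-matching optimization. Below is a minimal, hypothetical PyTorch sketch; the `mimic_features` helper, the MSE loss, the Adam optimizer, the random initialization, and the iteration budget are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torchvision.models as models

def mimic_features(extractor, target_image, steps=1000, lr=0.01):
    """Optimize an image so its extracted features mimic those of target_image."""
    extractor.eval()
    for p in extractor.parameters():
        p.requires_grad_(False)                       # the extractor stays frozen
    with torch.no_grad():
        target_feat = extractor(target_image)         # feature to mimic
    # Random initialization is an assumption; the paper may start differently.
    adv = torch.rand_like(target_image).requires_grad_(True)
    opt = torch.optim.Adam([adv], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(extractor(adv), target_feat)
        loss.backward()
        opt.step()
        with torch.no_grad():
            adv.clamp_(0.0, 1.0)                      # keep pixels in a valid range
    return adv.detach()

# Usage sketch: mimic VGG16 convolutional features (one extractor named above).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
image = torch.rand(1, 3, 224, 224)                    # stand-in for an MSCOCO image
adversarial = mimic_features(vgg, image)
```

Because only the feature extractor is queried, the same adversarial image should transfer to any downstream task built on top of it, which is what makes the attack task-agnostic and gray-box.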


URL

https://arxiv.org/abs/1906.04606

PDF

https://arxiv.org/pdf/1906.04606.pdf

