Abstract
Adversarial attacks on machine learning models have attracted increasing interest in recent years. By making only subtle changes to the input of a convolutional neural network, the output of the network can be swayed to a completely different result. The first attacks did this by slightly changing the pixel values of an input image to fool a classifier into outputting the wrong class. Other approaches have tried to learn "patches" that can be applied to an object to fool detectors and classifiers. Some of these approaches have also shown that these attacks are feasible in the real world, e.g. by modifying an object and filming it with a video camera. However, all of these approaches target classes that contain almost no intra-class variety (e.g. stop signs). The known structure of the object is then used to generate an adversarial patch on top of it. In this paper, we present an approach to generate adversarial patches for targets with lots of intra-class variety, namely persons. The goal is to generate a patch that is able to successfully hide a person from a person detector. Such an attack could, for instance, be used maliciously to circumvent surveillance systems: intruders could sneak around undetected by holding a small cardboard plate in front of their body, aimed towards the surveillance camera. Our results show that our system is able to significantly lower the accuracy of a person detector. Our approach also functions well in real-life scenarios where the patch is filmed by a camera. To the best of our knowledge, we are the first to attempt this kind of attack on targets with a high level of intra-class variety like persons.
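To make the attack described above concrete, the sketch below shows the general shape of adversarial-patch optimization: a learnable patch is pasted onto training images and updated by gradient descent to suppress the detector's person confidence. This is a minimal, hypothetical illustration, not the paper's implementation; the paper attacks the YOLOv2 detector and additionally uses printability and smoothness terms, while the `detector` interface, `apply_patch` helper, patch size, and hyperparameters here are assumptions for illustration only.

```python
import torch

def apply_patch(images, patch, box):
    # Paste the learnable patch onto each image at a fixed top-left
    # corner (y, x). A real attack would instead scale, rotate, and
    # place the patch over each detected person.
    y, x = box
    patched = images.clone()
    patched[:, :, y:y + patch.shape[1], x:x + patch.shape[2]] = patch
    return patched

def optimize_patch(detector, loader, steps=1000, lr=0.03, device="cpu"):
    # Start from random noise; values are kept in [0, 1] like pixels.
    patch = torch.rand(3, 64, 64, device=device, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _, (images, _) in zip(range(steps), loader):
        images = images.to(device)
        patched = apply_patch(images, patch, box=(80, 80))
        # Assumed interface: `detector` returns one person-confidence
        # score per image; minimizing it hides the person.
        person_scores = detector(patched)
        loss = person_scores.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            patch.clamp_(0.0, 1.0)  # keep the patch a valid image
    return patch.detach()
```

Note the key design choice this sketch captures: only the patch pixels are optimized, while the detector's weights and the input images stay fixed, so the same printed patch can be reused against unseen people and scenes.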
URL
https://arxiv.org/abs/1904.08653