FoolSDEdit: Deceptively Steering Your Edits Towards Targeted Attribute-aware Distribution

2024-02-06 04:56:43
Qi Zhou, Dongxia Wang, Tianlin Li, Zhihong Xu, Yang Liu, Kui Ren, Wenhai Wang, Qing Guo

Abstract

Guided image synthesis methods, like SDEdit based on the diffusion model, excel at creating realistic images from user inputs such as stroke paintings. However, existing efforts mainly focus on image quality, often overlooking a key point: the diffusion model represents a data distribution, not individual images. This introduces a low but critical chance of generating images that contradict user intentions, raising ethical concerns. For example, a user inputting a stroke painting with female characteristics might, with some probability, get male faces from SDEdit. To expose this potential vulnerability, we aim to build an adversarial attack forcing SDEdit to generate a specific data distribution aligned with a specified attribute (e.g., female), without changing the input's attribute characteristics. We propose the Targeted Attribute Generative Attack (TAGA), using an attribute-aware objective function and optimizing the adversarial noise added to the input stroke painting. Empirical studies reveal that traditional adversarial noise struggles with TAGA, while natural perturbations like exposure and motion blur easily alter generated images' attributes. To execute effective attacks, we introduce FoolSDEdit: We design a joint adversarial exposure and blur attack, adding exposure and motion blur to the stroke painting and optimizing them together. We optimize the execution strategy of various perturbations, framing it as a network architecture search problem. We create the SuperPert, a graph representing diverse execution strategies for different perturbations. After training, we obtain the optimized execution strategy for effective TAGA against SDEdit. Comprehensive experiments on two datasets show our method compelling SDEdit to generate a targeted attribute-aware data distribution, significantly outperforming baselines.
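The two natural perturbations named in the abstract, exposure and motion blur, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it applies a single global gain and a fixed horizontal blur to a stroke painting, whereas the paper optimizes the perturbations jointly and searches over their execution order with SuperPert. The function names (`apply_exposure`, `apply_motion_blur`, `perturb`) and the parameters `gain` and `length` are hypothetical, chosen only for illustration.

```python
import numpy as np


def apply_exposure(image, gain):
    """Scale pixel intensities by a global gain and clip to [0, 1]."""
    return np.clip(image * gain, 0.0, 1.0)


def apply_motion_blur(image, length):
    """Blur each row of an H x W x C image with a horizontal box kernel.

    `length` is the kernel size in pixels and must be a positive odd integer.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    pad = length // 2
    # Replicate edge pixels so the output keeps the input width.
    padded = np.pad(image, ((0, 0), (pad, pad), (0, 0)), mode="edge")
    width = image.shape[1]
    out = np.zeros(image.shape, dtype=float)
    # Average `length` horizontally shifted copies of the image.
    for offset in range(length):
        out += padded[:, offset:offset + width, :] / length
    return out


def perturb(image, gain, length):
    """Apply exposure first, then motion blur (one possible ordering)."""
    return apply_motion_blur(apply_exposure(image, gain), length)
```

In an actual attack, `gain` and the blur kernel would be optimized against an attribute-aware objective rather than fixed by hand, and the order in which the perturbations are applied would itself be part of the search.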

URL

https://arxiv.org/abs/2402.03705

PDF

https://arxiv.org/pdf/2402.03705.pdf
