Pre-training image representations from raw text about images enables zero-shot vision transfer to downstream tasks. Through pre-training on millions of samples collected from the internet, multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results that are often competitive with fully supervised methods without the need for task-specific training. Beyond the encouraging classification accuracy, these models are reported to close the robustness gap by matching the performance of supervised models trained on ImageNet under natural distribution shift. Because robustness is critical to real-world applications, especially safety-critical ones, in this paper we present a comprehensive evaluation based on a large-scale robustness benchmark covering 7 natural and 3 synthetic distribution shifts and 11 adversarial attacks, using CLIP as a pilot study. We show that CLIP suffers a significant robustness drop compared to supervised ImageNet models on our benchmark, especially under synthetic distribution shift and adversarial attacks. Furthermore, data overlap analysis suggests that the observed robustness under natural distribution shifts could be attributed, at least in part, to data overlap. In summary, our evaluation shows that a comprehensive evaluation of robustness is necessary and that there is a significant need to improve the robustness of zero-shot multimodal models.
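For context, the zero-shot protocol being stress-tested can be sketched in a few lines with the public `clip` package: class names become text prompts, and each image is assigned to the class whose prompt embedding is most similar. The dataset path and ImageFolder layout below are placeholder assumptions for whichever shifted test set is evaluated.

```python
import torch
import clip
from torchvision import datasets

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder: a distribution-shifted test set arranged as an ImageFolder.
dataset = datasets.ImageFolder("path/to/shifted_imagenet", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=64)

# Build the zero-shot classifier from class-name prompts.
prompts = clip.tokenize([f"a photo of a {c}" for c in dataset.classes]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(prompts)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        img_feat = model.encode_image(images.to(device))
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        pred = (img_feat @ text_feat.T).argmax(dim=-1).cpu()
        correct += (pred == labels).sum().item()
        total += labels.numel()
print(f"zero-shot accuracy under shift: {correct / total:.4f}")
```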
https://arxiv.org/abs/2403.10499
Diffusion-based audio and music generation models commonly generate music by constructing an image representation of audio (e.g., a mel-spectrogram) and then converting it to audio using a phase reconstruction model or vocoder. Typical vocoders, however, produce monophonic audio at lower resolutions (e.g., 16-24 kHz), which limits their effectiveness. We propose MusicHiFi -- an efficient high-fidelity stereophonic vocoder. Our method employs a cascade of three generative adversarial networks (GANs) that converts low-resolution mel-spectrograms to audio, upsamples to high-resolution audio via bandwidth expansion, and upmixes to stereophonic audio. Compared to previous work, we propose 1) a unified GAN-based generator and discriminator architecture and training procedure for each stage of our cascade, 2) a new fast, near downsampling-compatible bandwidth extension module, and 3) a new fast downmix-compatible mono-to-stereo upmixer that ensures the preservation of monophonic content in the output. We evaluate our approach using both objective and subjective listening tests and find that our approach yields comparable or better audio quality, better spatialization control, and significantly faster inference speed compared to past work. Sound examples are at this https URL.
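The "downmix-compatible" property lends itself to a compact illustration. Below is one plausible reading (an assumption, not the authors' architecture): the upmixer predicts only a side signal and reconstructs left/right mid-side style, so averaging the two output channels recovers the mono input exactly.

```python
import torch
import torch.nn as nn

class DownmixCompatibleUpmix(nn.Module):
    """Predict a side signal; mid-side reconstruction guarantees that
    averaging the left/right outputs reproduces the mono input."""
    def __init__(self, hidden=32):
        super().__init__()
        self.side_net = nn.Sequential(
            nn.Conv1d(1, hidden, kernel_size=7, padding=3), nn.Tanh(),
            nn.Conv1d(hidden, 1, kernel_size=7, padding=3))

    def forward(self, mono):                     # mono: (batch, 1, samples)
        side = self.side_net(mono)
        left, right = mono + side, mono - side
        return torch.cat([left, right], dim=1)   # (batch, 2, samples)

mono = torch.randn(1, 1, 16000)
stereo = DownmixCompatibleUpmix()(mono)
# Downmix check: the mono content is preserved by construction.
assert torch.allclose(stereo.mean(dim=1, keepdim=True), mono, atol=1e-6)
```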
https://arxiv.org/abs/2403.10493
Mitigating hallucinations of Large Multi-modal Models (LMMs) is crucial to enhance their reliability for general-purpose assistants. This paper shows that such hallucinations of LMMs can be significantly exacerbated by preceding user-system dialogues. To precisely measure this, we first present an evaluation benchmark by extending popular multi-modal benchmark datasets with prepended hallucinatory dialogues generated by our novel Adversarial Question Generator, which can automatically generate image-related yet adversarial dialogues by adopting adversarial attacks on LMMs. On our benchmark, the zero-shot performance of state-of-the-art LMMs dropped significantly for both the VQA and Captioning tasks. Next, we further reveal this hallucination is mainly due to the prediction bias toward preceding dialogues rather than visual content. To reduce this bias, we propose Adversarial Instruction Tuning that robustly fine-tunes LMMs on augmented multi-modal instruction-following datasets with hallucinatory dialogues. Extensive experiments show that our proposed approach successfully reduces dialogue hallucination while maintaining or even improving performance.
https://arxiv.org/abs/2403.10492
Enhancing the robustness of deep learning models, particularly in the realm of vision transformers (ViTs), is crucial for their real-world deployment. In this work, we provide a fine-tuning approach to enhance the robustness of vision transformers inspired by the concept of nullspace from linear algebra. Our investigation centers on whether a vision transformer can exhibit resilience to input variations akin to the nullspace property in linear mappings, implying that perturbations sampled from this nullspace do not influence the model's output when added to the input. Firstly, we show that for many pretrained ViTs, a non-trivial nullspace exists due to the presence of the patch embedding layer. Secondly, as nullspace is a concept associated with linear algebra, we demonstrate that it is possible to synthesize approximate nullspace elements for the non-linear blocks of ViTs employing an optimisation strategy. Finally, we propose a fine-tuning strategy for ViTs wherein we augment the training data with synthesized approximate nullspace noise. After fine-tuning, we find that the model demonstrates robustness to adversarial and natural image perturbations alike.
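The nullspace claim about the patch-embedding layer admits a direct numerical check. The sketch below uses a random weight matrix as a stand-in for pretrained patch-embedding weights (an assumption; the dimensions correspond to a /32 patch embedding, where the patch dimension exceeds the embedding dimension, so a non-trivial nullspace must exist):

```python
import torch

torch.manual_seed(0)
embed_dim, patch_dim = 768, 3 * 32 * 32     # e.g. a ViT with 32x32 patches
W = torch.randn(embed_dim, patch_dim)       # stand-in for pretrained weights

# Nullspace basis from the SVD: right singular vectors beyond the rank.
U, S, Vh = torch.linalg.svd(W, full_matrices=True)
null_basis = Vh[embed_dim:]                 # (patch_dim - embed_dim) vectors

# Sample a perturbation from the nullspace and add it to a flattened patch.
patch = torch.randn(patch_dim)
delta = torch.randn(null_basis.shape[0]) @ null_basis

# The embedded token (and hence the ViT output) is unchanged.
print(torch.allclose(W @ patch, W @ (patch + delta), atol=1e-3))  # True
```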
https://arxiv.org/abs/2403.10476
Shadow removal is a task aimed at erasing regional shadows present in images and reinstating visually pleasing natural scenes with consistent illumination. While recent deep learning techniques have demonstrated impressive performance in image shadow removal, their robustness against adversarial attacks remains largely unexplored. Furthermore, many existing attack frameworks typically allocate a uniform budget for perturbations across the entire input image, which may not be suitable for attacking shadow images. This is primarily due to the unique characteristic of spatially varying illumination within shadow images. In this paper, we propose a novel approach, called shadow-adaptive adversarial attack. Different from standard adversarial attacks, our attack budget is adjusted based on the pixel intensity in different regions of shadow images. Consequently, the optimized adversarial noise in the shadowed regions becomes visually less perceptible while permitting a greater tolerance for perturbations in non-shadow regions. The proposed shadow-adaptive attacks naturally align with the varying illumination distribution in shadow images, resulting in perturbations that are less conspicuous. Building on this, we conduct a comprehensive empirical evaluation of existing shadow removal methods, subjecting them to various levels of attack on publicly available datasets.
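To make the idea of a spatially varying budget concrete, here is a hedged PGD-style sketch in which the per-pixel budget grows with luminance, so noise in dark (shadowed) regions stays less perceptible. The luminance-based scaling rule, step sizes, and budgets are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def shadow_adaptive_pgd(model, x, target, steps=10, alpha=2/255,
                        eps_lo=2/255, eps_hi=8/255):
    # Per-pixel budget: dark (shadow) pixels get a small budget, bright
    # pixels a larger one (illustrative rule based on rough luminance).
    lum = x.mean(dim=1, keepdim=True)
    eps_map = eps_lo + (eps_hi - eps_lo) * lum
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = torch.nn.functional.mse_loss(model(x + delta), target)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta = delta + alpha * grad.sign()        # ascend: degrade removal
            delta = torch.max(torch.min(delta, eps_map), -eps_map)
            delta = (x + delta).clamp(0, 1) - x        # keep a valid image
        delta.requires_grad_(True)
    return (x + delta).detach()

# toy usage with a stand-in "shadow removal" network
net = torch.nn.Conv2d(3, 3, 3, padding=1)
x = torch.rand(1, 3, 64, 64)
adv = shadow_adaptive_pgd(net, x, target=x.clone())
```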
https://arxiv.org/abs/2403.10076
The standard approach to tackling computer vision problems is to train deep convolutional neural network (CNN) models using large-scale image datasets which are representative of the target task. However, in many scenarios, it is often challenging to obtain sufficient image data for the target task. Data augmentation is a way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways so as to create the required volume and variability of training data necessary to achieve good generalization performance. In situations where data for the target domain is not accessible, a viable workaround is to synthesize training data from scratch--i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modeling, neural style transfer (NST), differential neural rendering, and generative artificial intelligence (AI) techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, general scope of application and specific use-cases, as well as existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting the main features, application domains and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we are hoping to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.
https://arxiv.org/abs/2403.10075
Deep neural networks are vulnerable to adversarial attacks, often leading to erroneous outputs. Adversarial training has been recognized as one of the most effective methods to counter such attacks. However, existing adversarial training techniques have predominantly been tested on balanced datasets, whereas real-world data often exhibit a long-tailed distribution, casting doubt on the efficacy of these methods in practical scenarios. In this paper, we delve into adversarial training under long-tailed distributions. Through an analysis of the previous work "RoBal", we discover that utilizing Balanced Softmax Loss (BSL) alone can achieve performance comparable to the complete RoBal approach while significantly reducing training overheads. Additionally, we reveal that, similar to uniform distributions, adversarial training under long-tailed distributions also suffers from robust overfitting. To address this, we explore data augmentation as a solution and unexpectedly discover that, unlike results obtained with balanced data, data augmentation not only effectively alleviates robust overfitting but also significantly improves robustness. We further investigate the reasons behind the improvement of robustness through data augmentation and identify that it is attributable to the increased diversity of examples. Extensive experiments further corroborate that data augmentation alone can significantly improve robustness. Finally, building on these findings, we demonstrate that compared to RoBal, the combination of BSL and data augmentation leads to a +6.66% improvement in model robustness under AutoAttack on CIFAR-10-LT. Our code is available at this https URL.
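Balanced Softmax Loss, the component the analysis above identifies as doing most of RoBal's work, has a compact standard form: per-class sample counts shift the logits so the softmax accounts for the long-tailed label prior. A minimal sketch (the toy counts below are an assumption):

```python
import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, labels, class_counts):
    # Add log-priors to the logits before the usual cross-entropy.
    prior = class_counts.float() / class_counts.sum()
    return F.cross_entropy(logits + prior.log(), labels)

# toy long-tailed setup: 10 classes with exponentially decaying counts
counts = torch.tensor([5000 // (2 ** i) for i in range(10)])
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(balanced_softmax_loss(logits, labels, counts))
```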
https://arxiv.org/abs/2403.10073
This paper introduces unified projection-free Frank-Wolfe type algorithms for adversarial continuous DR-submodular optimization, spanning scenarios such as full information and (semi-)bandit feedback, monotone and non-monotone functions, different constraints, and types of stochastic queries. For every problem considered in the non-monotone setting, the proposed algorithms are either the first with proven sub-linear $\alpha$-regret bounds or have better $\alpha$-regret bounds than the state of the art, where $\alpha$ is a corresponding approximation bound in the offline setting. In the monotone setting, the proposed approach gives state-of-the-art sub-linear $\alpha$-regret bounds among projection-free algorithms in 7 of the 8 considered cases while matching the result of the remaining case. Additionally, this paper addresses semi-bandit and bandit feedback for adversarial DR-submodular optimization, advancing the understanding of this optimization area.
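To ground the term "projection-free": Frank-Wolfe-type methods never project onto the constraint set; each step queries a linear maximization oracle and moves toward its output. Below is a toy continuous-greedy sketch for the offline monotone case, the template on which such adversarial and bandit-feedback variants build; the budget constraint and objective are illustrative assumptions.

```python
import torch

def continuous_greedy(grad_fn, d, b, steps=100):
    x = torch.zeros(d)
    for _ in range(steps):
        g = grad_fn(x)
        # Linear maximization oracle over {v in [0,1]^d : sum(v) <= b}:
        # mass on the (at most b) coordinates with the largest positive gradient.
        v = torch.zeros(d)
        top = torch.topk(g, k=b).indices
        v[top[g[top] > 0]] = 1.0
        x += v / steps          # projection-free update; x stays feasible
    return x

# toy monotone DR-submodular objective f(x) = sum_i (x_i - x_i^2 / 2)
print(continuous_greedy(lambda x: 1 - x, d=10, b=3))
```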
https://arxiv.org/abs/2403.10063
Dataset distillation (DD) allows datasets to be distilled to fractions of their original size while preserving the rich distributional information, so that models trained on the distilled datasets can achieve comparable accuracy while saving significant computational load. Recent research in this area has been focusing on improving the accuracy of models trained on distilled datasets. In this paper, we aim to explore a new perspective of DD. We study how to embed adversarial robustness in distilled datasets, so that models trained on these datasets maintain the high accuracy and meanwhile acquire better adversarial robustness. We propose a new method that achieves this goal by incorporating curvature regularization into the distillation process with much less computational overhead than standard adversarial training. Extensive empirical experiments suggest that our method not only outperforms standard adversarial training on both accuracy and robustness with less computation overhead but is also capable of generating robust distilled datasets that can withstand various adversarial attacks.
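As a hedged illustration of what a curvature regularizer can look like (the paper's exact form may differ), the sketch below penalizes the finite-difference change of the input gradient along its own direction, a standard low-cost proxy for loss-surface curvature that needs only two gradient computations rather than a full PGD inner loop:

```python
import torch
import torch.nn.functional as F

def curvature_penalty(model, x, y, h=1e-2):
    # Finite-difference curvature along the normalized input-gradient direction.
    x = x.detach().requires_grad_(True)
    g, = torch.autograd.grad(F.cross_entropy(model(x), y), x, create_graph=True)
    z = g.detach()
    z = z / (z.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
    g2, = torch.autograd.grad(F.cross_entropy(model(x + h * z), y), x,
                              create_graph=True)
    return (g2 - g).flatten(1).norm(dim=1).pow(2).mean()

# toy usage; in distillation this term would be added to the main objective
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
print(curvature_penalty(net, x, y))
```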
https://arxiv.org/abs/2403.10045
Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision making policy without sharing raw trajectories. However, if a small fraction of these agents are adversarial, it can lead to catastrophic results. We propose a policy gradient based approach that is robust to adversarial agents which can send arbitrary values to the server. Under this setting, our results form the first global convergence guarantees with general parametrization. These results demonstrate resilience with adversaries, while achieving sample complexity of order $\tilde{\mathcal{O}}\left( \frac{1}{\epsilon^2} \left( \frac{1}{N-f} + \frac{f^2}{(N-f)^2}\right)\right)$, where $N$ is the total number of agents and $f$ is the number of adversarial agents.
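The abstract does not spell out the server's aggregation rule, so as one illustration of tolerating agents that "send arbitrary values to the server", here is a coordinate-wise trimmed mean over stacked policy gradients, a standard Byzantine-robust aggregator (an assumption, not necessarily the paper's choice):

```python
import torch

def trimmed_mean(grads, f):
    """grads: (N, d) stacked agent gradients; f: adversaries tolerated.
    Discards the f largest and f smallest values per coordinate."""
    sorted_grads, _ = torch.sort(grads, dim=0)
    return sorted_grads[f: grads.shape[0] - f].mean(dim=0)

N, f, d = 10, 2, 5
honest = torch.randn(N - f, d)
byzantine = 1e6 * torch.ones(f, d)      # arbitrary malicious values
print(trimmed_mean(torch.cat([honest, byzantine]), f))  # stays bounded
```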
https://arxiv.org/abs/2403.09940
Domain adaptation methods for object detection (OD) strive to mitigate the impact of distribution shifts by promoting feature alignment across source and target domains. Multi-source domain adaptation (MSDA) allows leveraging multiple annotated source datasets, and unlabeled target data to improve the accuracy and robustness of the detection model. Most state-of-the-art MSDA methods for OD perform feature alignment in a class-agnostic manner. This is challenging since the objects have unique modal information due to variations in object appearance across domains. A recent prototype-based approach proposed a class-wise alignment, yet it suffers from error accumulation due to noisy pseudo-labels which can negatively affect adaptation with imbalanced data. To overcome these limitations, we propose an attention-based class-conditioned alignment scheme for MSDA that aligns instances of each object category across domains. In particular, an attention module coupled with an adversarial domain classifier allows learning domain-invariant and class-specific instance representations. Experimental results on multiple benchmarking MSDA datasets indicate that our method outperforms the state-of-the-art methods and is robust to class imbalance. Our code is available at this https URL.
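The adversarial domain classifier mentioned above is conventionally built on the gradient-reversal trick: flipping the classifier's gradient before it reaches the feature extractor pushes features to be domain-invariant. A minimal sketch follows (the attention module and class conditioning are omitted; layer sizes are assumptions):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None   # reversed gradient to the backbone

class DomainClassifier(nn.Module):
    def __init__(self, feat_dim, n_domains):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                  nn.Linear(256, n_domains))
    def forward(self, feats, lam=1.0):
        return self.head(GradReverse.apply(feats, lam))

feats = torch.randn(16, 512, requires_grad=True)   # instance features
logits = DomainClassifier(512, 3)(feats)           # e.g. 3 source domains
```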
https://arxiv.org/abs/2403.09918
This paper introduces semantic features as a general conceptual framework for fully explainable neural network layers. A well-motivated proof-of-concept model for a relevant subproblem of MNIST consists of 4 such layers with a total of 4.8K learnable parameters. The model is easily interpretable, achieves human-level adversarial test accuracy with no form of adversarial training, requires little hyperparameter tuning and can be quickly trained on a single CPU. The general nature of the technique bears promise for a paradigm shift towards radically democratised and truly generalizable white box neural networks. The code is available at this https URL
https://arxiv.org/abs/2403.09863
Different from traditional task-specific vision models, recent large VLMs can readily adapt to different vision tasks by simply using different textual instructions, i.e., prompts. However, a well-known concern about traditional task-specific vision models is that they can be misled by imperceptible adversarial perturbations. Furthermore, the concern is exacerbated by the phenomenon that the same adversarial perturbations can fool different task-specific models. Given that VLMs rely on prompts to adapt to different tasks, an intriguing question emerges: Can a single adversarial image mislead all predictions of VLMs when a thousand different prompts are given? This question essentially introduces a novel perspective on adversarial transferability: cross-prompt adversarial transferability. In this work, we propose the Cross-Prompt Attack (CroPA). This proposed method updates the visual adversarial perturbation with learnable prompts, which are designed to counteract the misleading effects of the adversarial image. By doing this, CroPA significantly improves the transferability of adversarial examples across prompts. Extensive experiments are conducted to verify the strong cross-prompt adversarial transferability of CroPA with prevalent VLMs including Flamingo, BLIP-2, and InstructBLIP in various different tasks. Our source code is available at \url{this https URL}.
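The alternating structure of CroPA-style optimisation can be sketched schematically: the image perturbation ascends the attack loss while a learnable prompt perturbation descends it, so the image attack must survive prompts that actively resist it. The function signature and toy objective below are assumptions, not the authors' API.

```python
import torch

def cropa_step(attack_loss, delta, prompt_delta,
               alpha_img=1/255, alpha_txt=1e-2, eps=8/255):
    # attack_loss(delta, prompt_delta) -> scalar; higher = model more fooled.
    loss = attack_loss(delta, prompt_delta)
    g_img, g_txt = torch.autograd.grad(loss, [delta, prompt_delta])
    with torch.no_grad():
        delta += alpha_img * g_img.sign()   # ascend: strengthen the image attack
        delta.clamp_(-eps, eps)
        prompt_delta -= alpha_txt * g_txt   # descend: prompt counteracts it
    return loss.item()

# toy stand-in for a VLM attack objective (hypothetical)
W = torch.randn(16, 16)
delta = torch.zeros(16, requires_grad=True)
prompt_delta = torch.zeros(16, requires_grad=True)
cropa_step(lambda d, p: ((W @ (d + 1)) * p.sigmoid()).sum(), delta, prompt_delta)
```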
https://arxiv.org/abs/2403.09766
In this paper, we explore the unique modality of sketch for explainability, emphasising the profound impact of human strokes compared to conventional pixel-oriented studies. Beyond explanations of network behavior, we discern the genuine implications of explainability across diverse downstream sketch-related tasks. We propose a lightweight and portable explainability solution -- a seamless plugin that integrates effortlessly with any pre-trained model, eliminating the need for re-training. Demonstrating its adaptability, we present four applications: highly studied retrieval and generation, and completely novel assisted drawing and sketch adversarial attacks. The centrepiece of our solution is a stroke-level attribution map that takes different forms when linked with downstream tasks. By addressing the inherent non-differentiability of rasterisation, we enable explanations at both coarse stroke level (SLA) and partial stroke level (P-SLA), each with its advantages for specific downstream tasks.
https://arxiv.org/abs/2403.09480
Large Vision-Language Models (LVLMs) have shown significant progress in responding well to visual-instructions from users. However, these instructions, encompassing images and text, are susceptible to both intentional and inadvertent attacks. Despite the critical importance of LVLMs' robustness against such threats, current research in this area remains limited. To bridge this gap, we introduce AVIBench, a framework designed to analyze the robustness of LVLMs when facing various adversarial visual-instructions (AVIs), including four types of image-based AVIs, ten types of text-based AVIs, and nine types of content bias AVIs (such as gender, violence, cultural, and racial biases, among others). We generate 260K AVIs encompassing five categories of multimodal capabilities (nine tasks) and content bias. We then conduct a comprehensive evaluation involving 14 open-source LVLMs to assess their performance. AVIBench also serves as a convenient tool for practitioners to evaluate the robustness of LVLMs against AVIs. Our findings and extensive experimental results shed light on the vulnerabilities of LVLMs, and highlight that inherent biases exist even in advanced closed-source LVLMs like GeminiProVision and GPT-4V. This underscores the importance of enhancing the robustness, security, and fairness of LVLMs. The source code and benchmark will be made publicly available.
https://arxiv.org/abs/2403.09346
Scene-Text Visual Question Answering (ST-VQA) aims to understand scene text in images and answer questions related to the text content. Most existing methods heavily rely on the accuracy of Optical Character Recognition (OCR) systems, and aggressive fine-tuning based on limited spatial location information and erroneous OCR text information often leads to inevitable overfitting. In this paper, we propose a multimodal adversarial training architecture with spatial awareness capabilities. Specifically, we introduce an Adversarial OCR Enhancement (AOE) module, which leverages adversarial training in the embedding space of the OCR modality to enhance fault-tolerant representation of OCR texts, thereby reducing noise caused by OCR errors. Simultaneously, we add a Spatial-Aware Self-Attention (SASA) mechanism to help the model better capture the spatial relationships among OCR tokens. Various experiments demonstrate that our method achieves significant performance improvements on both the ST-VQA and TextVQA datasets and provides a novel paradigm for multimodal adversarial training.
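Embedding-space adversarial training of the kind the AOE module describes can be sketched generically: perturb the OCR token embeddings along the loss gradient and train on the perturbed embeddings so the model tolerates OCR noise. The `loss_fn` interface and perturbation size below are assumptions, not the paper's implementation.

```python
import torch

def aoe_training_loss(loss_fn, ocr_embeds, eps=1e-2):
    # loss_fn(embeds) -> scalar task loss given OCR token embeddings.
    ocr_embeds = ocr_embeds.detach().requires_grad_(True)
    g, = torch.autograd.grad(loss_fn(ocr_embeds), ocr_embeds)
    # Move each token embedding a small step along its loss gradient.
    adv = ocr_embeds + eps * g / (g.norm(dim=-1, keepdim=True) + 1e-12)
    return loss_fn(adv.detach())   # train on the perturbed embeddings

# toy usage with a hypothetical loss over (batch, ocr tokens, dim) embeddings
emb = torch.randn(4, 12, 256)
loss = aoe_training_loss(lambda e: e.pow(2).mean(), emb)
```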
https://arxiv.org/abs/2403.09288
Although Graph Neural Networks (GNNs) have exhibited the powerful ability to gather graph-structured information from neighborhood nodes via various message-passing mechanisms, the performance of GNNs is limited by poor generalization and fragile robustness caused by noisy and redundant graph data. As a prominent solution, Graph Augmentation Learning (GAL) has recently received increasing attention. Among prior GAL approaches, edge-dropping methods that randomly remove edges from a graph during training are effective techniques to improve the robustness of GNNs. However, randomly dropping edges often results in bypassing critical edges, consequently weakening the effectiveness of message passing. In this paper, we propose a novel adversarial edge-dropping method (ADEdgeDrop) that leverages an adversarial edge predictor guiding the removal of edges, which can be flexibly incorporated into diverse GNN backbones. Employing an adversarial training framework, the edge predictor utilizes the line graph transformed from the original graph to estimate the edges to be dropped, which improves the interpretability of the edge-dropping method. The proposed ADEdgeDrop is optimized alternately by stochastic gradient descent and projected gradient descent. Comprehensive experiments on six graph benchmark datasets demonstrate that the proposed ADEdgeDrop outperforms state-of-the-art baselines across various GNN backbones, demonstrating improved generalization and robustness.
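A simplified sketch of learnable edge dropping in this spirit (the adversarial training loop and line-graph transformation are omitted; the scoring network is an assumption): an edge predictor scores each edge from its endpoints' features, and low-score edges are dropped before message passing.

```python
import torch
import torch.nn as nn

class EdgePredictor(nn.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, x, edge_index):
        # Each edge (a node of the line graph) is scored from its endpoints.
        pair = torch.cat([x[edge_index[0]], x[edge_index[1]]], dim=-1)
        return torch.sigmoid(self.mlp(pair)).squeeze(-1)  # keep-probabilities

x = torch.randn(5, 16)                                   # toy node features
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
keep = EdgePredictor(16)(x, edge_index) > 0.5            # boolean edge mask
pruned_edges = edge_index[:, keep]                       # fed to the GNN
```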
https://arxiv.org/abs/2403.09171
Adversarial training (AT) is currently one of the most effective ways to obtain the robustness of deep neural networks against adversarial attacks. However, most AT methods suffer from robust overfitting, i.e., a significant generalization gap in adversarial robustness between the training and testing curves. In this paper, we first identify a connection between robust overfitting and the excessive memorization of noisy labels in AT from the perspective of the gradient norm. Since such label noise is mainly caused by a distribution mismatch and improper label assignments, we are motivated to propose a label refinement approach for AT. Specifically, our Self-Guided Label Refinement first self-refines a more accurate and informative label distribution from over-confident hard labels, and then it calibrates the training by dynamically incorporating knowledge from self-distilled models into the current model and thus requiring no external teachers. Empirical results demonstrate that our method can simultaneously boost the standard accuracy and robust performance across multiple benchmark datasets, attack types, and architectures. In addition, we also provide a set of analyses from the perspectives of information theory to dive into our method and suggest the importance of soft labels for robust generalization.
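One illustrative form of such label refinement (a hedged sketch, not the paper's exact update rule): soften the over-confident hard label with the prediction of a self-distilled copy of the model (e.g., an EMA of its own weights), then train against the refined distribution.

```python
import torch
import torch.nn.functional as F

def refined_targets(ema_logits, labels, n_classes, mix=0.7):
    # Blend the one-hot label with the self-distilled model's soft prediction.
    one_hot = F.one_hot(labels, n_classes).float()
    soft = F.softmax(ema_logits, dim=-1)
    return mix * one_hot + (1 - mix) * soft

def refinement_loss(student_logits, targets):
    # Cross-entropy against the refined soft label distribution.
    return -(targets * F.log_softmax(student_logits, dim=-1)).sum(-1).mean()

logits, ema_logits = torch.randn(8, 10), torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(refinement_loss(logits, refined_targets(ema_logits, labels, 10)))
```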
https://arxiv.org/abs/2403.09101
Label corruption, where training samples have incorrect labels, can significantly degrade the performance of machine learning models. This corruption often arises from non-expert labeling or adversarial attacks. Acquiring large, perfectly labeled datasets is costly, and retraining large models from scratch when a clean dataset becomes available is computationally expensive. To address this challenge, we propose Post-Training Correction, a new paradigm that adjusts model parameters after initial training to mitigate label noise, eliminating the need for retraining. We introduce Verifix, a novel Singular Value Decomposition (SVD) based algorithm that leverages a small, verified dataset to correct the model weights using a single update. Verifix uses SVD to estimate a Clean Activation Space and then projects the model's weights onto this space to suppress activations corresponding to corrupted data. We demonstrate Verifix's effectiveness on both synthetic and real-world label noise. Experiments on the CIFAR dataset with 25% synthetic corruption show 7.36% generalization improvements on average. Additionally, we observe generalization improvements of up to 2.63% on naturally corrupted datasets like WebVision1.0 and Clothing1M.
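A minimal sketch of the projection idea as described above (shapes and the exact projection are assumptions): estimate a "clean activation space" from the verified inputs via SVD, then project a layer's weight matrix onto that subspace in a single update, suppressing the directions that corrupted data activated.

```python
import torch

def verifix_update(W, clean_acts, k):
    # clean_acts: (n_samples, in_dim) layer inputs from the verified set.
    U, S, Vh = torch.linalg.svd(clean_acts, full_matrices=False)
    basis = Vh[:k]                   # top-k right singular vectors
    P = basis.T @ basis              # projector onto the clean subspace
    return W @ P                     # keep only clean input directions

W = torch.randn(128, 64)             # (out_dim, in_dim) toy layer weight
acts = torch.randn(1000, 64)         # stand-in for verified-set activations
W_corrected = verifix_update(W, acts, k=32)   # single post-training update
```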
https://arxiv.org/abs/2403.08618
Graph neural networks (GNNs) are widely utilized to capture the information spreading patterns in graphs. While remarkable performance has been achieved, evaluating node influence has emerged as a new topic of interest. We propose a new method of evaluating node influence, which measures the prediction change of a trained GNN model caused by removing a node. A real-world application is, "In the task of predicting Twitter accounts' polarity, had a particular account been removed, how would others' polarity change?". We use the GNN as a surrogate model whose prediction could simulate the change of nodes or edges caused by node removal. To obtain the influence for every node, a straightforward way is to alternately remove every node and apply the trained GNN on the modified graph. It is reliable but time-consuming, so we need an efficient method. The related lines of work, such as graph adversarial attack and counterfactual explanation, cannot directly satisfy our needs, since they do not focus on the global influence score for every node. We propose an efficient and intuitive method, NOde-Removal-based fAst GNN inference (NORA), which uses the gradient to approximate the node-removal influence. It only costs one forward propagation and one backpropagation to approximate the influence score for all nodes. Extensive experiments on six datasets and six GNN models verify the effectiveness of NORA. Our code is available at this https URL.
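A hedged first-order sketch of such a gradient approximation (the real method also accounts for edge effects; the GNN signature below is an assumption): one forward and one backward pass give the gradient of the output with respect to node features, and removing node v is approximated by zeroing its features.

```python
import torch

def approx_node_influence(gnn, x, edge_index):
    # One forward and one backward pass score all nodes at once.
    x = x.detach().requires_grad_(True)
    out = gnn(x, edge_index)             # (n_nodes, n_classes) predictions
    out.sum().backward()
    # Removing node v ~ setting x_v -> 0; the first-order change in the
    # output is -grad_v . x_v, so its magnitude serves as the influence.
    return (x.grad * x).sum(dim=1).abs().detach()

gnn = lambda feats, ei: feats @ torch.randn(16, 4)  # stand-in trained GNN
scores = approx_node_influence(gnn, torch.randn(30, 16), edge_index=None)
print(scores.topk(5).indices)                       # most influential nodes
```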
https://arxiv.org/abs/2403.08333