
Effective and Robust Adversarial Training against Data and Label Corruptions

2024-05-07 10:53:20
Peng-Fei Zhang, Zi Huang, Xin-Shun Xu, Guangdong Bai

Abstract

Corruptions caused by data perturbations and label noise are prevalent in datasets from unreliable sources and pose significant threats to model training. Despite existing efforts to develop robust models, current learning methods commonly overlook the possible co-existence of both corruptions, which limits the effectiveness and practicality of the resulting models. In this paper, we develop an Effective and Robust Adversarial Training (ERAT) framework that handles both types of corruption (i.e., data and label) simultaneously, without prior knowledge of their specifics. We propose hybrid adversarial training built around multiple potential adversarial perturbations, alongside semi-supervised learning based on class-rebalancing sample selection, to enhance the model's resilience to dual corruption. On the one hand, in the proposed adversarial training, a perturbation generation module learns multiple surrogate malicious data perturbations by taking a DNN model as the victim, while the model is trained to maintain semantic consistency between the original data and the hybrid perturbed data. This is expected to enable the model to cope with unpredictable perturbations in real-world data corruption. On the other hand, a class-rebalancing data selection strategy is designed to fairly differentiate clean labels from noisy labels, and semi-supervised learning is then performed with the noisy labels discarded. Extensive experiments demonstrate the superiority of the proposed ERAT framework.
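
The abstract does not spell out how the class-rebalancing selection works, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a small-loss criterion (samples with low training loss are treated as clean) applied separately within each given-label class, so that classes with uniformly higher loss are not starved of selected samples. The function name `select_clean_per_class` and the `keep_ratio` parameter are hypothetical.

```python
from collections import defaultdict


def select_clean_per_class(losses, labels, keep_ratio=0.5):
    """Return sorted indices of samples treated as clean.

    Hypothetical rule (an assumption, not taken from the paper): within each
    given-label class, keep the `keep_ratio` fraction of samples with the
    smallest loss, and always keep at least one sample per class.
    """
    # Group (loss, index) pairs by the given (possibly noisy) label.
    by_class = defaultdict(list)
    for idx, (loss, label) in enumerate(zip(losses, labels)):
        by_class[label].append((loss, idx))

    selected = []
    for items in by_class.values():
        items.sort()  # ascending by loss, ties broken by index
        k = max(1, int(len(items) * keep_ratio))
        selected.extend(idx for _, idx in items[:k])
    return sorted(selected)


# Toy data: class 1 has higher losses overall than class 0.
losses = [0.1, 0.9, 0.2, 0.8, 2.0, 2.5, 3.0, 2.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

clean = select_clean_per_class(losses, labels, keep_ratio=0.5)
# Samples outside the clean set would have their labels discarded and be
# used as unlabeled data in the semi-supervised step.
noisy = [i for i in range(len(losses)) if i not in clean]
print(clean, noisy)
```

In this toy data, a single global threshold that kept the four smallest losses would select only class 0 samples; selecting per class keeps two samples from each class, which is the kind of fairness the abstract appears to aim for.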


URL

https://arxiv.org/abs/2405.04191

PDF

https://arxiv.org/pdf/2405.04191.pdf

