
Instance-Conditioned GAN Data Augmentation for Representation Learning

2023-03-16 22:45:43
Pietro Astolfi, Arantxa Casanova, Jakob Verbeek, Pascal Vincent, Adriana Romero-Soriano, Michal Drozdzal

Abstract

Data augmentation has become a crucial component in training state-of-the-art visual representation models. However, handcrafting combinations of transformations that lead to improved performance is a laborious task, which can result in visually unrealistic samples. To overcome these limitations, recent works have explored the use of generative models as learnable data augmentation tools, showing promising results in narrow application domains, e.g., few-shot learning and low-data medical imaging. In this paper, we introduce a data augmentation module, called DA_IC-GAN, which leverages instance-conditioned GAN generations and can be used off-the-shelf in conjunction with most state-of-the-art training recipes. We showcase the benefits of DA_IC-GAN by plugging it out-of-the-box into the supervised training of ResNets and DeiT models on the ImageNet dataset, achieving accuracy boosts of between 1%p and 2%p with the highest-capacity models. Moreover, the learnt representations are shown to be more robust than the baselines when transferred to a handful of out-of-distribution datasets, and exhibit increased invariance to variations of instance and viewpoint. We additionally couple DA_IC-GAN with a self-supervised training recipe and show that we can also achieve an improvement of 1%p in accuracy in some settings. With this work, we strengthen the evidence on the potential of learnable data augmentations to improve visual representation learning, paving the road towards non-handcrafted augmentations in model training.
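
The abstract describes DA_IC-GAN as a module that plugs into an existing training recipe, without spelling out the mechanics. The following is a minimal Python sketch of one way such a plug-in could look, not the authors' implementation. Every name in it is hypothetical: `generate_from_instance` stands in for sampling from an instance-conditioned GAN, `handcrafted_augment` stands in for the recipe's existing augmentation, and the mixing probability `p_generated` is an assumption, as is the choice to pass generated samples through the handcrafted augmentation.

```python
import random
from typing import Callable, Optional, TypeVar

Image = TypeVar("Image")


def da_ic_gan_augment(
    image: Image,
    generate_from_instance: Callable[[Image], Image],
    handcrafted_augment: Callable[[Image], Image],
    p_generated: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Image:
    """Hypothetical augmentation wrapper (illustrative only).

    With probability `p_generated` (an assumed hyperparameter), the real
    image is swapped for a sample produced by `generate_from_instance`,
    a stand-in for an instance-conditioned generator. The result is then
    passed through the recipe's usual handcrafted augmentation.
    """
    if not 0.0 <= p_generated <= 1.0:
        raise ValueError("p_generated must be in [0, 1]")
    rng = rng if rng is not None else random.Random()
    # Assumption: generated samples replace the real image stochastically.
    if rng.random() < p_generated:
        image = generate_from_instance(image)
    # Assumption: the existing recipe's augmentation is still applied.
    return handcrafted_augment(image)
```

In a training loop, a wrapper like this would sit where the per-sample transform is normally applied, so that the rest of the recipe (model, optimizer, schedule) stays unchanged. Setting `p_generated` to 0 recovers the original recipe.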

Abstract (translated)

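Data augmentation has become a key component of training modern visual representation models. However, hand-designing combinations of transformations that improve performance is laborious and can produce visually unrealistic samples. To overcome these limitations, recent work has explored generative models as learnable data augmentation tools, with promising results in narrow application domains such as few-shot learning and low-data medical imaging. In this paper, we introduce DA_IC-GAN, a data augmentation module that leverages instance-conditioned GAN generations and can be used together with most state-of-the-art training recipes. We demonstrate the benefits of DA_IC-GAN by plugging it into the supervised training of ResNet and DeiT models on the ImageNet dataset, raising accuracy by 1%p to 2%p for the highest-capacity models. In addition, when transferred to a handful of out-of-distribution datasets, the learned representations are more robust than the baselines and show increased invariance to variations of instance and viewpoint. We also combine DA_IC-GAN with a self-supervised training recipe and show that it can improve accuracy by 1%p in some settings. With this work, we strengthen the evidence that learnable data augmentation can improve visual representation learning, opening a path toward model training without handcrafted augmentations.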

URL

https://arxiv.org/abs/2303.09677

PDF

https://arxiv.org/pdf/2303.09677

