Abstract
The standard approach to tackling computer vision problems is to train deep convolutional neural network (CNN) models on large-scale image datasets that are representative of the target task. In many scenarios, however, it is challenging to obtain sufficient image data for the target task. Data augmentation is one way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways so as to create the volume and variability of training data needed to achieve good generalization performance. In situations where data for the target domain is not accessible, a viable workaround is to synthesize training data from scratch, i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modeling, neural style transfer (NST), differential neural rendering, and generative artificial intelligence (AI) techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, the general scope of application and specific use cases, as well as existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting their main features, application domains, and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we hope to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.
URL
https://arxiv.org/abs/2403.10075