Abstract
Medical image data are often limited because acquisition and annotation are expensive. Hence, training a deep-learning model on the raw data alone can easily lead to overfitting. One solution is to augment the raw data with various transformations, improving the model's ability to generalize to new data. However, manually configuring a generic combination of augmentations and their parameters for different datasets is non-trivial because acquisition approaches and data distributions are inconsistent. Automatic data augmentation has therefore been proposed to learn favorable augmentation strategies for different datasets, but it incurs large GPU overhead. To this end, we present a novel method, called Dynamic Data Augmentation (DDAug), which is efficient and has negligible computation cost. DDAug develops a hierarchical tree structure to represent various augmentations and utilizes an efficient Monte Carlo tree search algorithm to update, prune, and sample the tree. As a result, the augmentation pipeline can be optimized for each dataset automatically. Experiments on multiple prostate MRI datasets show that our method outperforms the current state-of-the-art data augmentation strategies.
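The abstract names three tree operations (update, prune, sample) without giving details, so the following is only a minimal sketch of how a tree of augmentations could be searched, not the authors' implementation. Everything in it is an assumption made for illustration: the `AugNode` class, the UCB1 selection rule, the pruning thresholds, the three operation names, and the `toy_reward` function, which stands in for a real validation score.

```python
import math

class AugNode:
    """One node in a tree of augmentation operations (hypothetical structure)."""
    def __init__(self, op, parent=None):
        self.op = op            # augmentation name, e.g. "flip"
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # sum of rewards seen through this node

    def add_child(self, op):
        child = AugNode(op, parent=self)
        self.children.append(child)
        return child

    def ucb(self, c=1.4):
        # Assumed selection rule (UCB1); unvisited nodes are tried first.
        if self.visits == 0:
            return float("inf")
        mean = self.value / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def build_tree(ops, depth=2):
    """Build a full tree in which each level chooses one operation."""
    root = AugNode("root")
    frontier = [root]
    for _ in range(depth):
        frontier = [node.add_child(op) for node in frontier for op in ops]
    return root

def sample_path(root):
    """Sample: walk from the root to a leaf, following the highest UCB score."""
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.ucb())
        path.append(node)
    return path

def update(root, path, reward):
    """Update: add the reward to every node on the sampled path."""
    root.visits += 1
    for node in path:
        node.visits += 1
        node.value += reward

def prune(node, min_visits=3, threshold=0.3):
    """Prune: drop children with a low mean reward after enough visits."""
    node.children = [
        ch for ch in node.children
        if ch.visits < min_visits or ch.value / ch.visits >= threshold
    ]
    for ch in node.children:
        prune(ch, min_visits, threshold)

def toy_reward(path):
    # Placeholder for a validation metric: pipelines using "noise" score 0.
    return 0.0 if any(n.op == "noise" for n in path) else 1.0

def run_search(steps=300, prune_every=50):
    root = build_tree(["flip", "rotate", "noise"])
    for step in range(1, steps + 1):
        path = sample_path(root)
        update(root, path, toy_reward(path))
        if step % prune_every == 0:
            prune(root)
    return root

root = run_search()
print([child.op for child in root.children])
```

In an actual training loop, the reward would come from evaluating the model after training with the sampled pipeline, and the operation set, tree depth, and pruning schedule would need to be chosen for the dataset at hand.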
URL
https://arxiv.org/abs/2305.15777