Abstract
We investigate the impact of deep generative models on potential social biases in upcoming computer vision models. As the internet witnesses an increasing influx of AI-generated images, concerns arise regarding inherent biases that may accompany them, potentially leading to the dissemination of harmful content. This paper explores whether a detrimental feedback loop, resulting in bias amplification, would occur if generated images were used as the training data for future models. We conduct simulations by progressively substituting original images in COCO and CC3M datasets with images generated through Stable Diffusion. The modified datasets are used to train OpenCLIP and image captioning models, which we evaluate in terms of quality and bias. Contrary to expectations, our findings indicate that introducing generated images during training does not uniformly amplify bias. Instead, instances of bias mitigation across specific tasks are observed. We further explore the factors that may influence these phenomena, such as artifacts in image generation (e.g., blurry faces) or pre-existing biases in the original datasets.
URL
https://arxiv.org/abs/2404.03242