Abstract
AI systems rely on extensive training on large datasets to address various tasks. However, image-based systems, particularly those used for demographic attribute prediction, face significant challenges. Many current face image datasets focus primarily on demographic factors such as age, gender, and skin tone, overlooking other crucial facial attributes such as hairstyle and accessories. This narrow focus limits the diversity of the data and, consequently, the robustness of AI systems trained on it. This work addresses the limitation by proposing a methodology for generating synthetic face image datasets that capture a broader spectrum of facial diversity. Specifically, our approach uses a systematic prompt formulation strategy that covers not only demographics and biometrics but also non-permanent traits such as make-up, hairstyle, and accessories. These prompts guide a state-of-the-art text-to-image model in generating a comprehensive dataset of high-quality, realistic images, which can be used as an evaluation set for face analysis systems. Compared to existing datasets, our proposed dataset proves equally or more challenging in image classification tasks while being much smaller in size.
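The systematic prompt formulation described above can be illustrated with a minimal sketch. It assumes one simple way to do it: take the Cartesian product of a few attribute lists and fill a fixed template. The attribute names, their values, the template wording, and the function `build_prompts` are all hypothetical and are not taken from the paper; the text-to-image generation step is omitted.

```python
from itertools import product

# Hypothetical attribute values, chosen for illustration only.
# The paper's actual attribute lists are not given in this abstract.
ATTRIBUTES = {
    "age": ["young", "middle-aged", "senior"],
    "gender": ["woman", "man"],
    "skin_tone": ["light", "medium", "dark"],
    "hairstyle": ["short hair", "long curly hair"],
    "makeup": ["no make-up", "bold make-up"],
    "accessory": ["no accessories", "glasses", "a hat"],
}

# Hypothetical prompt template combining demographic and non-permanent traits.
TEMPLATE = (
    "A realistic portrait photo of a {age} {gender} with {skin_tone} skin, "
    "{hairstyle}, {makeup}, wearing {accessory}"
)


def build_prompts(attributes=ATTRIBUTES, template=TEMPLATE):
    """Return one (prompt, labels) pair per combination of attribute values."""
    keys = list(attributes)
    prompts = []
    for values in product(*(attributes[k] for k in keys)):
        labels = dict(zip(keys, values))
        prompts.append((template.format(**labels), labels))
    return prompts


if __name__ == "__main__":
    prompts = build_prompts()
    print(len(prompts))
    print(prompts[0][0])
```

Keeping the label dictionary next to each prompt means every generated image would come with ground-truth attribute labels, which is what would make such a set usable for evaluating classifiers.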
URL
https://arxiv.org/abs/2404.17255