SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes

2024-04-26 08:51:31
Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos

Abstract

AI systems rely on extensive training on large datasets to address various tasks. However, image-based systems, particularly those used for demographic attribute prediction, face significant challenges. Many current face image datasets primarily focus on demographic factors such as age, gender, and skin tone, overlooking other crucial facial attributes like hairstyle and accessories. This narrow focus limits the diversity of the data and consequently the robustness of AI systems trained on it. This work aims to address this limitation by proposing a methodology for generating synthetic face image datasets that capture a broader spectrum of facial diversity. Specifically, our approach integrates a systematic prompt formulation strategy, encompassing not only demographics and biometrics but also non-permanent traits like make-up, hairstyle, and accessories. The resulting prompts guide a state-of-the-art text-to-image model in generating a comprehensive dataset of high-quality realistic images, which can be used as an evaluation set for face analysis systems. Compared to existing datasets, our proposed dataset proves equally or more challenging in image classification tasks while being much smaller in size.
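
The prompt formulation strategy mentioned in the abstract can be pictured as enumerating combinations of attribute values and rendering each combination as a text prompt. The following is only a minimal sketch of that idea, not the authors' implementation: the attribute names, their values, the prompt template, and the helper `build_prompts` are all hypothetical, since the abstract does not list the actual categories or wording.

```python
from itertools import product

# Hypothetical attribute vocabulary. The abstract names the attribute families
# (demographics, biometrics, make-up, hairstyle, accessories) but not their
# values, so everything below is an illustrative assumption.
ATTRIBUTES = {
    "age": ["young", "middle-aged", "elderly"],
    "gender": ["man", "woman"],
    "skin_tone": ["light", "medium", "dark"],
    "hairstyle": ["short hair", "long curly hair"],
    "accessory": ["glasses", "a hat"],
}


def build_prompts(attributes):
    """Return (labels, prompt) pairs, one per combination of attribute values."""
    keys = list(attributes)
    prompts = []
    # Cartesian product over all value lists: every combination appears once.
    for values in product(*(attributes[k] for k in keys)):
        labels = dict(zip(keys, values))
        # Hypothetical template; the real prompt wording is not in the abstract.
        text = (
            f"Portrait photo, {labels['age']} {labels['gender']}, "
            f"{labels['skin_tone']} skin tone, {labels['hairstyle']}, "
            f"wearing {labels['accessory']}"
        )
        prompts.append((labels, text))
    return prompts


prompts = build_prompts(ATTRIBUTES)
```

Each prompt would then be passed to a text-to-image model, and the `labels` dictionary kept alongside the generated image as its ground-truth annotation. Enumerating the full product gives every attribute combination equal representation, which is one plausible way to keep a small evaluation set balanced.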

URL

https://arxiv.org/abs/2404.17255

PDF

https://arxiv.org/pdf/2404.17255.pdf

