Abstract
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated by several factors, such as the lack of real data and intra-class variability, the time and errors involved in manual labeling, and in some cases privacy concerns. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which only synthetic data from the DCFace and GANDiffFace methods was allowed for training face recognition systems, this 2nd edition proposes new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.
Abstract (translated)
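Synthetic data is becoming increasingly important for training machine learning models. This is mainly due to factors such as the lack of real data and intra-class variability and the time and errors involved in manual labeling; in some cases, privacy concerns also play a role. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), held at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations. Unlike the 1st edition, which only allowed synthetic data from the DCFace and GANDiffFace methods for training face recognition systems, the 2nd edition proposes new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, significantly advance the application of synthetic data to face recognition.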
URL
https://arxiv.org/abs/2404.10378