Modern face recognition systems use deep neural networks to extract salient features from a face. These features, known as embeddings, live in a latent space and are often stored as templates in a face recognition system. The embeddings are susceptible to data leakage and, in some cases, can even be used to reconstruct the original face image. To prevent compromising identities, template protection schemes are commonly employed. However, these schemes may still not prevent the leakage of soft biometric information such as age, gender, and race. To alleviate this issue, we propose a novel technique that combines Fully Homomorphic Encryption (FHE) with an existing template protection scheme known as PolyProtect. We show that the embeddings can be compressed and encrypted using FHE and transformed into a secure PolyProtect template using a polynomial transformation for additional protection. We demonstrate the efficacy of the proposed approach through extensive experiments on multiple datasets. The proposed approach ensures irreversibility and unlinkability, effectively preventing the leakage of soft biometric attributes from face embeddings without compromising recognition accuracy.
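The abstract does not define the PolyProtect mapping itself; as context, here is a minimal numpy sketch of the style of polynomial transformation PolyProtect applies (window size, coefficients, and exponents below are illustrative, and the FHE compression/encryption layer is omitted entirely):

```python
import numpy as np

def polyprotect(embedding, coeffs, exponents, overlap=0):
    """Map an embedding to a shorter protected template.

    Each output element is a polynomial over a window of m consecutive
    embedding values: sum_i c_i * v_i ** e_i. The coefficient and
    exponent arrays are the user-specific secret.
    """
    m = len(coeffs)
    step = m - overlap
    out = []
    for start in range(0, len(embedding) - m + 1, step):
        window = embedding[start:start + m].astype(float)
        out.append(np.sum(coeffs * window ** exponents))
    return np.array(out)

rng = np.random.default_rng(0)
v = rng.standard_normal(512)             # hypothetical 512-D face embedding
C = rng.integers(1, 50, size=5)          # secret coefficients (non-zero)
E = np.array([1, 2, 3, 4, 5])            # secret exponent permutation
P = polyprotect(v, C, E)
print(P.shape)  # (102,) — 512-D embedding, window 5, no overlap
```

The mapping is many-to-one (irreversible), and a fresh (C, E) pair per enrollment gives unlinkability between templates of the same subject.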
https://arxiv.org/abs/2404.16255
Face Recognition Systems (FRS) are widely used in commercial environments, such as e-commerce and e-banking, owing to their high accuracy in real-world conditions. However, these systems are vulnerable to facial morphing attacks, which are generated by blending color face images of different subjects. This paper presents a new method for generating 3D face morphs from two bona fide point clouds. The proposed method first selects bona fide point clouds with neutral expressions. The two input point clouds are then registered using Bayesian Coherent Point Drift (BCPD) without optimization, and the geometry and color of the registered point clouds are averaged to generate a face-morphing point cloud. The proposed method generates 388 face-morphing point clouds from 200 bona fide subjects. The effectiveness of the method was demonstrated through extensive vulnerability experiments, achieving a Generalized Morphing Attack Potential (G-MAP) of 97.93%, which is superior to the existing state-of-the-art (SOTA) with a G-MAP of 81.61%.
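The averaging step is simple once registration is done; a minimal sketch, assuming BCPD (out of scope here) has already placed the two clouds in row-wise point-to-point correspondence:

```python
import numpy as np

def morph_point_clouds(xyz_a, rgb_a, xyz_b, rgb_b, alpha=0.5):
    """Blend geometry and color of two registered point clouds.

    Assumes registration has put the clouds in correspondence, so row i
    of cloud A matches row i of cloud B. alpha=0.5 is the symmetric
    morph used for morphing attacks.
    """
    xyz_m = alpha * xyz_a + (1 - alpha) * xyz_b   # averaged geometry
    rgb_m = alpha * rgb_a + (1 - alpha) * rgb_b   # averaged color
    return xyz_m, rgb_m
```

With `alpha=0.5` the morph sits "between" both subjects, which is what makes it likely to verify against either identity.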
https://arxiv.org/abs/2404.15765
Face recognition applications have grown in parallel with the size of datasets, the complexity of deep learning models, and computational power. However, while deep learning models evolve to become more capable and computational power keeps increasing, the datasets available are being retracted and removed from public access. Privacy and ethical concerns are relevant topics within these domains. Through generative artificial intelligence, researchers have put efforts into the development of completely synthetic datasets that can be used to train face recognition systems. Nonetheless, the recent advances have not been sufficient to achieve performance comparable to the state-of-the-art models trained on real data. To study the drift between the performance of models trained on real and synthetic datasets, we leverage a massive attribute classifier (MAC) to create annotations for four datasets: two real and two synthetic. From these annotations, we conduct studies on the distribution of each attribute within all four datasets. Additionally, we further inspect the differences between real and synthetic datasets on the attribute set. When comparing through the Kullback-Leibler divergence we have found differences between real and synthetic samples. Interestingly, we have verified that while real samples suffice to explain the synthetic distribution, the opposite could not be further from the truth.
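The Kullback-Leibler comparison used here is standard; a minimal sketch with made-up attribute frequencies, which also shows the asymmetry the last sentence alludes to (KL divergence is not symmetric in its arguments):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete attribute distributions.

    Asymmetric: D_KL(p || q) can be large while D_KL(q || p) is small,
    e.g. when q concentrates mass on a subset of p's support.
    """
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

real = [0.25, 0.25, 0.25, 0.25]    # illustrative attribute frequencies, real set
synth = [0.70, 0.10, 0.10, 0.10]   # same attribute, synthetic set (made up)
print(kl_divergence(real, synth), kl_divergence(synth, real))
```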
https://arxiv.org/abs/2404.15234
The wide deployment of Face Recognition (FR) systems poses risks of privacy leakage. One countermeasure to address this issue is adversarial attacks, which deceive malicious FR searches but simultaneously interfere with the normal identity verification of trusted authorizers. In this paper, we propose the first Double Privacy Guard (DPG) scheme based on traceable adversarial watermarking. DPG employs a one-time watermark embedding to deceive unauthorized FR models and allows authorizers to perform identity verification by extracting the watermark. Specifically, we propose an information-guided adversarial attack against FR models. The encoder embeds an identity-specific watermark into the deep feature space of the carrier, guiding recognizable features of the image to deviate from the source identity. We further adopt a collaborative meta-optimization strategy compatible with sub-tasks, which regularizes the joint optimization direction of the encoder and decoder. This strategy enhances the representation of universal carrier features, mitigating multi-objective optimization conflicts in watermarking. Experiments confirm that DPG achieves significant attack success rates and traceability accuracy on state-of-the-art FR models, exhibiting remarkable robustness that outperforms the existing privacy protection methods using adversarial attacks and deep watermarking, or simple combinations of the two. Our work potentially opens up new insights into proactive protection for FR privacy.
https://arxiv.org/abs/2404.14693
Face recognition technology has become an integral part of modern security systems and user authentication processes. However, these systems are vulnerable to spoofing attacks and can easily be circumvented. Most prior research in face anti-spoofing (FAS) approaches it as a two-class classification task where models are trained on real samples and known spoof attacks and tested for detection performance on unknown spoof attacks. However, in practice, FAS should be treated as a one-class classification task where, while training, one cannot assume any knowledge regarding the spoof samples a priori. In this paper, we reformulate the face anti-spoofing task from a one-class perspective and propose a novel hyperbolic one-class classification framework. To train our network, we use a pseudo-negative class sampled from the Gaussian distribution with a weighted running mean and propose two novel loss functions: (1) Hyp-PC: Hyperbolic Pairwise Confusion loss, and (2) Hyp-CE: Hyperbolic Cross Entropy loss, which operate in the hyperbolic space. Additionally, we employ Euclidean feature clipping and gradient clipping to stabilize the training in the hyperbolic space. To the best of our knowledge, this is the first work extending hyperbolic embeddings for face anti-spoofing in a one-class manner. With extensive experiments on five benchmark datasets: Rose-Youtu, MSU-MFSD, CASIA-MFSD, Idiap Replay-Attack, and OULU-NPU, we demonstrate that our method significantly outperforms the state-of-the-art, achieving better spoof detection performance.
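The abstract does not give formulas; assuming the usual Poincaré-ball setup for hyperbolic embeddings, the Euclidean feature clipping it mentions and the exponential map at the origin (which carries Euclidean features into hyperbolic space) can be sketched as:

```python
import numpy as np

def clip_features(x, r=1.0):
    """Euclidean feature clipping: rescale x so its norm is at most r.
    Keeping features away from the Poincare-ball boundary is what
    stabilises training in hyperbolic space."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x * np.minimum(1.0, r / np.maximum(norm, 1e-12))

def expmap0(x, c=1.0):
    """Exponential map at the origin of a Poincare ball with curvature -c:
    expmap0(x) = tanh(sqrt(c)*||x||) * x / (sqrt(c)*||x||)."""
    norm = np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), 1e-12)
    return np.tanh(np.sqrt(c) * norm) * x / (np.sqrt(c) * norm)

x = np.array([[3.0, 4.0]])            # Euclidean feature, norm 5
z = expmap0(clip_features(x, r=1.0))  # hyperbolic embedding
print(np.linalg.norm(z))              # < 1: strictly inside the unit ball
```

The Hyp-PC and Hyp-CE losses then operate on embeddings like `z` using hyperbolic (geodesic) distances rather than Euclidean ones.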
https://arxiv.org/abs/2404.14406
Heterogeneous Face Recognition (HFR) aims to expand the applicability of Face Recognition (FR) systems to challenging scenarios, enabling the matching of face images across different domains, such as matching thermal images to visible spectra. However, the development of HFR systems is challenging because of the significant domain gap between modalities and the lack of availability of large-scale paired multi-channel data. In this work, we leverage a pretrained face recognition model as a teacher network to learn domain-invariant network layers called Domain-Invariant Units (DIU) to reduce the domain gap. The proposed DIU can be trained effectively even with a limited amount of paired training data, in a contrastive distillation framework. This proposed approach has the potential to enhance pretrained models, making them more adaptable to a wider range of variations in data. We extensively evaluate our approach on multiple challenging benchmarks, demonstrating superior performance compared to state-of-the-art methods.
https://arxiv.org/abs/2404.14343
Heterogeneous Face Recognition (HFR) focuses on matching faces from different domains, for instance, thermal to visible images, making Face Recognition (FR) systems more versatile for challenging scenarios. However, the domain gap between these domains and the limited large-scale datasets in the target HFR modalities make it challenging to develop robust HFR models from scratch. In our work, we view different modalities as distinct styles and propose a method to modulate feature maps of the target modality to address the domain gap. We present a new Conditional Adaptive Instance Modulation (CAIM) module that seamlessly fits into existing FR networks, turning them into HFR-ready systems. The CAIM block modulates intermediate feature maps, efficiently adapting to the style of the source modality and bridging the domain gap. Our method enables end-to-end training using a small set of paired samples. We extensively evaluate the proposed approach on various challenging HFR benchmarks, showing that it outperforms state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
https://arxiv.org/abs/2404.14247
Facial biometrics are an essential component of smartphones, ensuring reliable and trustworthy authentication. However, face biometric systems are vulnerable to Presentation Attacks (PAs), and the availability of more sophisticated presentation attack instruments such as 3D silicone face masks allows attackers to deceive face recognition systems easily. In this work, we propose a novel Presentation Attack Detection (PAD) algorithm based on 3D point clouds captured using the frontal camera of a smartphone. The proposed PAD algorithm, VoxAtnNet, voxelizes the 3D point clouds to preserve their spatial structure. A novel convolutional attention network is then trained on the voxelized 3D samples to detect PAs on the smartphone. Extensive experiments were carried out on a newly constructed 3D face point cloud dataset comprising bona fide samples and two different 3D PAIs (3D silicone face mask and wrap photo mask), resulting in 3480 samples. The performance of the proposed method was compared with that of existing methods using three different evaluation protocols. The experimental results demonstrate the improved performance of the proposed method in detecting both known and unknown face presentation attacks.
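A minimal occupancy-grid sketch of the voxelization step (the paper's exact grid resolution and feature encoding are not specified in the abstract, so a 32³ binary occupancy grid is assumed here):

```python
import numpy as np

def voxelize(points, grid=(32, 32, 32)):
    """Convert an (N, 3) point cloud into a binary occupancy voxel grid,
    preserving spatial structure for a 3D convolutional network."""
    mins = points.min(axis=0)
    span = np.maximum(points.max(axis=0) - mins, 1e-12)
    idx = ((points - mins) / span * (np.array(grid) - 1)).astype(int)
    idx = np.clip(idx, 0, np.array(grid) - 1)      # guard rounding edges
    vox = np.zeros(grid, dtype=np.float32)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vox
```

The resulting fixed-size tensor is what a convolutional attention network can consume directly, unlike the raw variable-length point list.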
https://arxiv.org/abs/2404.12680
Face-morphing attacks are a growing concern for biometric researchers, as they can be used to fool face recognition systems (FRS). These attacks can be generated at the image level (supervised) or representation level (unsupervised). Previous unsupervised morphing attacks have relied on generative adversarial networks (GANs). More recently, researchers have used linear interpolation of StyleGAN-encoded images to generate morphing attacks. In this paper, we propose a new method for generating high-quality morphing attacks using StyleGAN disentanglement. Our approach, called MLSD-GAN, spherically interpolates the disentangled latents to produce realistic and diverse morphing attacks. We evaluate the vulnerability of MLSD-GAN on two deep-learning-based FRS techniques. The results show that MLSD-GAN poses a significant threat to FRS, as it can generate morphing attacks that are highly effective at fooling these systems.
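Spherical linear interpolation (slerp) of two disentangled latent codes can be sketched as below; the StyleGAN encoder and generator around it are omitted, since only the interpolation rule is described in the abstract:

```python
import numpy as np

def slerp(z0, z1, t=0.5):
    """Spherical linear interpolation between two latent codes.

    Unlike linear interpolation, slerp follows the great circle between
    the codes, which tends to stay on the latent manifold.
    """
    u0 = z0 / np.linalg.norm(z0)
    u1 = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if np.isclose(omega, 0.0):          # (near-)parallel codes: fall back
        return (1 - t) * z0 + t * z1
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)
```

A morph latent would be `slerp(encode(face_a), encode(face_b), 0.5)`, decoded by the generator into the attack image.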
https://arxiv.org/abs/2404.12679
Spreadsheets are widely recognized as the most popular end-user programming tools, which blend the power of formula-based computation with an intuitive table-based interface. Today, spreadsheets are used by billions of users to manipulate tables, most of whom are neither database experts nor professional programmers. Despite the success of spreadsheets, authoring complex formulas remains challenging, as non-technical users need to look up and understand non-trivial formula syntax. To address this pain point, we leverage the observation that there is often an abundance of similar-looking spreadsheets in the same organization, which not only have similar data, but also share similar computation logic encoded as formulas. We develop an Auto-Formula system that can accurately predict formulas that users want to author in a target spreadsheet cell, by learning and adapting formulas that already exist in similar spreadsheets, using contrastive-learning techniques inspired by "similar-face recognition" from computer vision. Extensive evaluations on over 2K test formulas extracted from real enterprise spreadsheets show the effectiveness of Auto-Formula over alternatives. Our benchmark data is available at this https URL to facilitate future research.
https://arxiv.org/abs/2404.12608
In recent years, Face Anti-Spoofing (FAS) has played a crucial role in preserving the security of face recognition technology. With the rise of counterfeit face generation techniques, the challenge posed by digitally edited faces to face anti-spoofing is escalating. Existing FAS technologies primarily focus on intercepting physically forged faces and lack a robust solution for cross-domain FAS challenges. Moreover, determining an appropriate threshold to achieve optimal deployment results remains an issue for intra-domain FAS. To address these issues, we propose a visualization method that intuitively reflects the training outcomes of models by visualizing the prediction results on datasets. Additionally, we demonstrate that employing data augmentation techniques, such as downsampling and Gaussian blur, can effectively enhance performance on cross-domain tasks. Building upon our data visualization approach, we also introduce a methodology for setting threshold values based on the distribution of the training dataset. Ultimately, our methods secured us second place in both the Unified Physical-Digital Face Attack Detection competition and the Snapshot Spectral Imaging Face Anti-spoofing contest. The training code is available at this https URL.
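The abstract does not give the authors' exact threshold rule; one common way to derive a threshold from the training-score distribution, offered here only as a hedged sketch, is to take a quantile of the bona-fide (live) scores so that the live-rejection rate is bounded (this assumes higher scores mean "live"):

```python
import numpy as np

def pick_threshold(live_scores, target_bpcer=0.01):
    """Pick a decision threshold from the live-score distribution.

    Chooses the threshold as the `target_bpcer` quantile of the live
    scores, so roughly that fraction of bona-fide samples fall below it
    (i.e. BPCER is bounded by design rather than tuned on spoof data).
    """
    return float(np.quantile(np.asarray(live_scores, float), target_bpcer))
```

A deployment would then classify a sample as an attack whenever its score falls below the returned threshold.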
https://arxiv.org/abs/2404.12602
Face Image Quality Assessment (FIQA) estimates the utility of face images for automated face recognition (FR) systems. We propose in this work a novel approach to assess the quality of face images based on inspecting the required changes in the pre-trained FR model weights to minimize differences between testing samples and the distribution of the FR training dataset. To achieve that, we propose quantifying the discrepancy in Batch Normalization statistics (BNS), including mean and variance, between those recorded during FR training and those obtained by processing testing samples through the pretrained FR model. We then generate gradient magnitudes of pretrained FR weights by backpropagating the BNS through the pretrained model. The cumulative absolute sum of these gradient magnitudes serves as the FIQ for our approach. Through comprehensive experimentation, we demonstrate the effectiveness of our training-free and quality labeling-free approach, achieving competitive performance to recent state-of-the-art FIQA approaches without relying on quality labeling, the need to train regression networks, specialized architectures, or designing and optimizing specific loss functions.
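The BNS discrepancy term the score is built on can be sketched as below. Note that the paper backpropagates this quantity through the pretrained model and sums the absolute gradient magnitudes over the weights; that backpropagation step is omitted in this numpy-only sketch, which shows only the discrepancy itself:

```python
import numpy as np

def bns_discrepancy(feats, bn_mean, bn_var):
    """Discrepancy between a test sample's activation statistics and the
    Batch-Normalization statistics stored at FR training time.

    feats: (N, C) activations at a BN layer for the test input(s).
    bn_mean, bn_var: (C,) running statistics recorded during training.
    """
    mu = feats.mean(axis=0)
    var = feats.var(axis=0)
    return float(np.sum((mu - bn_mean) ** 2) + np.sum((var - bn_var) ** 2))
```

Intuitively, a low-quality face drives activations away from the training distribution, inflating this discrepancy and, after backpropagation, the cumulative gradient magnitude used as the quality score.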
https://arxiv.org/abs/2404.12203
Face recognition (FR) has seen significant advancements due to the utilization of large-scale datasets. Training deep FR models on large-scale datasets with multiple GPUs is now a common practice. In fact, computing power has evolved into a foundational and indispensable resource in the area of deep learning. It is nearly impossible to train a deep FR model without adequate hardware resources. Recognizing this challenge, some FR approaches have started exploring ways to reduce the time complexity of the fully-connected layer in FR models. Unlike other approaches, this paper introduces a simple yet highly effective approach, the Moving Haar Learning Rate (MHLR) scheduler, for scheduling the learning rate promptly and accurately during training. MHLR supports large-scale FR training with only one GPU and reduces training time to 1/4 of the original without sacrificing more than 1% accuracy. More specifically, MHLR needs only 30 hours to train ResNet100 on the WebFace12M dataset, which contains more than 12M face images of 0.6M identities. Extensive experiments validate the efficiency and effectiveness of MHLR.
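The abstract does not specify the MHLR update rule. One plausible reading, sketched below as a purely hypothetical illustration and not the authors' actual algorithm, interprets "moving Haar" as the Haar-wavelet-style difference between two adjacent moving windows of the training loss: when that difference vanishes (a plateau), the learning rate is cut promptly instead of waiting for a fixed milestone:

```python
import numpy as np

class HaarLRScheduler:
    """Hypothetical sketch: drop the LR when the Haar-like difference
    between two adjacent loss windows falls below a tolerance."""

    def __init__(self, lr=0.1, window=5, tol=1e-3, factor=0.1):
        self.lr, self.window, self.tol, self.factor = lr, window, tol, factor
        self.losses = []

    def step(self, loss):
        self.losses.append(loss)
        w = self.window
        if len(self.losses) >= 2 * w:
            prev = np.mean(self.losses[-2 * w:-w])   # earlier window mean
            curr = np.mean(self.losses[-w:])         # recent window mean
            if prev - curr < self.tol:               # Haar difference ~ 0
                self.lr *= self.factor
                self.losses.clear()                  # restart detection
        return self.lr
```

The training loop would call `scheduler.step(loss)` once per iteration (or epoch) and apply the returned learning rate to the optimizer.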
https://arxiv.org/abs/2404.11118
Nowadays, deep learning models have reached incredible performance in the task of image generation. A large body of work addresses the task of face generation and editing, with human and automatic systems that struggle to distinguish what is real from what is generated. Whereas most systems have reached excellent visual generation quality, they still face difficulties in preserving the identity of the starting input subject. Among all the explored techniques, Semantic Image Synthesis (SIS) methods, whose goal is to generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern. Therefore, in this paper, we investigate the problem of identity preservation in face image generation and present an SIS architecture that exploits a cross-attention mechanism to merge identity, style, and semantic features to generate faces whose identities are as similar as possible to the input ones. Experimental results reveal that the proposed method is not only suitable for preserving the identity but is also effective as a face recognition adversarial attack, i.e., hiding a second identity in the generated faces.
https://arxiv.org/abs/2404.10408
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated by several factors, such as the lack of real data and of intra-class variability, the time and errors produced by manual labeling, and, in some cases, privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which synthetic data from the DCFace and GANDiffFace methods was only allowed to train face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.
https://arxiv.org/abs/2404.10378
Face Image Quality Assessment (FIQA) techniques have seen steady improvements over recent years, but their performance still deteriorates if the input face samples are not properly aligned. This alignment sensitivity comes from the fact that most FIQA techniques are trained or designed using a specific face alignment procedure. If the alignment technique changes, the performance of most existing FIQA techniques quickly becomes suboptimal. To address this problem, we present in this paper a novel knowledge distillation approach, termed AI-KD, that can extend any existing FIQA technique, improving its robustness to alignment variations and, in turn, its performance with different alignment procedures. To validate the proposed distillation approach, we conduct comprehensive experiments on 6 face datasets with 4 recent face recognition models and in comparison to 7 state-of-the-art FIQA techniques. Our results show that AI-KD consistently improves performance of the initial FIQA techniques not only with misaligned samples, but also with properly aligned facial images. Furthermore, it leads to a new state-of-the-art when used with a competitive initial FIQA approach. The code for AI-KD is made publicly available from: this https URL.
https://arxiv.org/abs/2404.09555
Face anti-spoofing (FAS) and face adversarial detection (FAD) have been regarded as critical technologies to ensure the safety of face recognition systems. As a consequence of their limited practicality and generalization, some existing methods aim to devise a framework capable of concurrently detecting both threats to address the challenge. Nevertheless, these methods still encounter challenges of insufficient generalization and suboptimal robustness, potentially owing to the inherent drawback of discriminative models. Motivated by the rich structural and detailed features of face generative models, we propose FaceCat, which utilizes a face generative model as a pre-trained model to improve the performance of FAS and FAD. Specifically, FaceCat elaborately designs a hierarchical fusion mechanism to capture rich face semantic features of the generative model. These features then serve as a robust foundation for a lightweight head, designed to execute FAS and FAD tasks simultaneously. As relying solely on single-modality data often leads to suboptimal performance, we further propose a novel text-guided multi-modal alignment strategy that utilizes text prompts to enrich feature representation, thereby enhancing performance. For fair evaluations, we build a comprehensive protocol with a wide range of 28 attack types to benchmark the performance. Extensive experiments validate that FaceCat generalizes significantly better and obtains excellent robustness against input transformations.
https://arxiv.org/abs/2404.09193
Face recognition systems are frequently subjected to a variety of physical and digital attacks of different types. Previous methods have achieved satisfactory performance in scenarios that address physical attacks and digital attacks, respectively. However, few methods integrate both physical and digital attack detection into a single model, implying the necessity to develop and maintain multiple models. To jointly detect physical and digital attacks within a single model, we propose an innovative approach that can adapt to any network architecture. Our approach mainly consists of two types of data augmentation, which we call Simulated Physical Spoofing Clues augmentation (SPSC) and Simulated Digital Spoofing Clues augmentation (SDSC). SPSC and SDSC augment live samples into simulated attack samples by simulating the spoofing clues of physical and digital attacks, respectively, which significantly improves the ability of the model to detect "unseen" attack types. Extensive experiments show that SPSC and SDSC achieve state-of-the-art generalization on Protocols 2.1 and 2.2 of the UniAttackData dataset, respectively. Our method won first place in the "Unified Physical-Digital Face Attack Detection" track of the 5th Face Anti-spoofing Challenge@CVPR2024. Our final submission obtains 3.75% APCER, 0.93% BPCER, and 2.34% ACER, respectively. Our code is available at this https URL.
https://arxiv.org/abs/2404.08450
With the advent of social media, fun selfie filters have come into tremendous mainstream use, affecting the functioning of facial biometric systems as well as image recognition systems. These filters range from beautification and Augmented Reality (AR)-based filters to filters that modify facial landmarks. Hence, there is a need to assess the impact of such filters on the performance of existing face recognition systems. The limitation of existing solutions is that they focus mostly on beautification filters. However, current AR-based filters and filters that distort facial key points are in vogue and make faces highly unrecognizable, even to the naked eye. Also, the filters previously considered are mostly obsolete, with limited variations. To mitigate these limitations, we perform a holistic impact analysis of the latest filters and propose a user recognition model for the filtered images. We utilize a benchmark dataset for baseline images and apply the latest filters over them to generate a beautified/filtered dataset. Next, we introduce a model, FaceFilterNet, for beautified user recognition. Within this framework, we also utilize our model to predict various attributes of the person, including age, gender, and ethnicity. In addition, we present a filter-wise impact analysis on face recognition, age estimation, and gender and ethnicity prediction. The proposed method affirms the efficacy of our dataset with an accuracy of 87.25% and an optimal accuracy for facial attribute analysis.
https://arxiv.org/abs/2404.08277
The advent of morphing attacks has posed significant security concerns for automated Face Recognition systems, raising the pressing need for robust Morphing Attack Detection (MAD) methods able to effectively address this issue. In this paper, we focus on Differential MAD (D-MAD), where a trusted live capture, usually representing the criminal, is compared with the document image to classify it as morphed or bona fide. We show that these approaches based on identity features are effective when the morphed image and the live one are sufficiently diverse; unfortunately, their effectiveness is significantly reduced when the same approaches are applied to look-alike subjects, or in all those cases in which the similarity between the two compared images is high (e.g., comparison between the morphed image and the accomplice). Therefore, in this paper, we propose ACIdA, a modular D-MAD system consisting of a module for attempt-type classification and two modules for identity and artifact analysis of the input images. Successfully addressing this task would broaden D-MAD applications to include, for instance, the document enrollment stage, which currently relies entirely on human evaluation, thus limiting the possibility of releasing ID documents with manipulated images, as well as automated gates able to detect both accomplices and criminals. An extensive cross-dataset experimental evaluation conducted on the introduced scenario shows that ACIdA achieves state-of-the-art results, outperforming literature competitors, while maintaining good performance on traditional D-MAD benchmarks.
https://arxiv.org/abs/2404.07667