Paper Reading AI Learner

Mask-up: Investigating Biases in Face Re-identification for Masked Faces

2024-02-21 12:48:45
Siddharth D Jaiswal, Ankit Kr. Verma, Animesh Mukherjee

Abstract

AI-based Face Recognition Systems (FRSs) are now widely distributed and deployed as MLaaS solutions all over the world, more so since the COVID-19 pandemic, for tasks ranging from validating individuals' faces while buying SIM cards to surveillance of citizens. Extensive biases have been reported against marginalized groups in these systems and have led to highly discriminatory outcomes. The post-pandemic world has normalized wearing face masks, but FRSs have not kept up with the changing times. As a result, these systems are susceptible to mask-based face occlusion. In this study, we audit four commercial and nine open-source FRSs on the task of face re-identification between different varieties of masked and unmasked images across five benchmark datasets (14,722 images in total). These simulate a realistic validation/surveillance task as deployed in all major countries around the world. Three of the commercial and five of the open-source FRSs are highly inaccurate; they further perpetuate biases against non-White individuals, with the lowest accuracy being 0%. A survey on the same task with 85 human participants also yields a low accuracy of 40%. Thus human-in-the-loop moderation in the pipeline does not alleviate these concerns, as has frequently been hypothesized in the literature. Our large-scale study shows that developers, lawmakers and users of such services need to rethink the design principles behind FRSs, especially for the task of face re-identification, taking cognizance of the observed biases.
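The re-identification task the paper audits boils down to matching a probe image (e.g. a masked face) against a gallery of enrolled (unmasked) faces. A common way FRSs implement this is to compare face embeddings by cosine similarity against a decision threshold. The sketch below illustrates that decision logic only; it is not the paper's pipeline, and the synthetic 128-d vectors stand in for embeddings an actual face-embedding model would produce. The function names and the threshold value are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def re_identify(probe_emb, gallery_embs, threshold=0.5):
    """Return the index of the best-matching gallery embedding,
    or None if no similarity clears the threshold (a 'no match')."""
    sims = [cosine_similarity(probe_emb, g) for g in gallery_embs]
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None

# Illustration with synthetic embeddings (stand-ins for real FRS outputs):
rng = np.random.default_rng(0)
identity = rng.normal(size=128)                       # the "true" identity
gallery = [rng.normal(size=128),                      # an unrelated person
           identity + 0.1 * rng.normal(size=128)]     # same person, unmasked
probe = identity + 0.1 * rng.normal(size=128)         # same person, e.g. masked
print(re_identify(probe, gallery))                    # matches gallery index 1
```

Occlusion such as a mask perturbs the probe embedding; if the perturbation drops the similarity below the threshold, the system rejects a genuine match, which is one mechanism behind the low accuracies the paper reports.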

URL

https://arxiv.org/abs/2402.13771

PDF

https://arxiv.org/pdf/2402.13771.pdf
