Paper Reading AI Learner

Mask Attack Detection Using Vascular-weighted Motion-robust rPPG Signals

2023-05-25 11:22:17
Chenglin Yao, Jianfeng Ren, Ruibin Bai, Heshan Du, Jiang Liu, Xudong Jiang

Abstract

Detecting 3D mask attacks on a face recognition system is challenging. Although genuine faces and 3D face masks show significantly different remote photoplethysmography (rPPG) signals, rPPG-based face anti-spoofing methods often suffer from performance degradation due to unstable face alignment in the video sequence and weak rPPG signals. To enhance the rPPG signal in a motion-robust way, a landmark-anchored face stitching method is proposed to align the faces robustly and precisely at the pixel level by using both SIFT keypoints and facial landmarks. To better encode the rPPG signal, a weighted spatial-temporal representation is proposed, which emphasizes the face regions with rich blood vessels. In addition, characteristics of rPPG signals in different color spaces are jointly utilized. To improve the generalization capability, a lightweight EfficientNet with a Gated Recurrent Unit (GRU) is designed to extract both spatial and temporal features from the rPPG spatial-temporal representation for classification. The proposed method is compared with state-of-the-art methods on five benchmark datasets under both intra-dataset and cross-dataset evaluations, and shows a significant and consistent improvement over other state-of-the-art rPPG-based methods for face spoofing detection.
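For illustration only, below is a minimal PyTorch sketch of the kind of lightweight EfficientNet + GRU classifier the abstract describes: a CNN backbone extracts spatial features from temporal chunks of the rPPG spatial-temporal representation, and a GRU aggregates them over time for binary (genuine vs. 3D mask) classification. The chunking scheme, input resolution, and layer sizes are assumptions for demonstration, not the authors' implementation.

```python
# Hedged sketch, not the paper's code: EfficientNet-B0 features + GRU over
# temporal chunks of an rPPG spatial-temporal map (shapes are assumptions).
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0

class RPPGSpoofClassifier(nn.Module):
    def __init__(self, hidden_size=128, num_classes=2):
        super().__init__()
        backbone = efficientnet_b0(weights=None)
        self.cnn = backbone.features          # spatial feature extractor (1280 channels)
        self.pool = nn.AdaptiveAvgPool2d(1)   # collapse spatial dims -> (B*T, 1280, 1, 1)
        self.gru = nn.GRU(1280, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (B, T, 3, H, W) -- T temporal chunks of the rPPG spatial-temporal map
        b, t, c, h, w = x.shape
        feats = self.pool(self.cnn(x.reshape(b * t, c, h, w))).flatten(1)
        feats = feats.reshape(b, t, -1)       # (B, T, 1280)
        _, h_n = self.gru(feats)              # temporal aggregation
        return self.fc(h_n[-1])               # logits: genuine vs. 3D mask

# Example usage with a dummy batch of 4 videos, 8 chunks each, 64x64 chunks:
logits = RPPGSpoofClassifier()(torch.randn(4, 8, 3, 64, 64))
```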


URL

https://arxiv.org/abs/2305.15940

PDF

https://arxiv.org/pdf/2305.15940.pdf

