Considerations on the Evaluation of Biometric Quality Assessment Algorithms

2023-03-23 14:26:21
Torsten Schlett, Christian Rathgeb, Juan Tapia, Christoph Busch

Abstract

Quality assessment algorithms can be used to estimate the utility of a biometric sample for the purpose of biometric recognition. "Error versus Discard Characteristic" (EDC) plots, and "partial Area Under Curve" (pAUC) values of curves therein, are generally used by researchers to evaluate the predictive performance of such quality assessment algorithms. An EDC curve depends on an error type such as the "False Non Match Rate" (FNMR), a quality assessment algorithm, a biometric recognition system, a set of comparisons each corresponding to a biometric sample pair, and a comparison score threshold corresponding to a starting error. To compute an EDC curve, comparisons are progressively discarded based on the associated samples' lowest quality scores, and the error is computed for the remaining comparisons. Additionally, a discard fraction limit or range must be selected to compute pAUC values, which can then be used to quantitatively rank quality assessment algorithms. This paper discusses and analyses various details for this kind of quality assessment algorithm evaluation, including general EDC properties, interpretability improvements for pAUC values based on a hard lower error limit and a soft upper error limit, the use of relative instead of discrete rankings, stepwise vs. linear curve interpolation, and normalisation of quality scores to a [0, 100] integer range. We also analyse the stability of quantitative quality assessment algorithm rankings based on pAUC values across varying pAUC discard fraction limits and starting errors, concluding that higher pAUC discard fraction limits should be preferred. The analyses are conducted both with synthetic data and with real data for a face image quality assessment scenario, with a focus on general modality-independent conclusions for EDC evaluations.
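
The EDC computation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that a higher comparison score means higher similarity, that the error is the FNMR over mated comparisons (a score below the threshold counts as an error), and that the quality of a comparison is the minimum of its two sample quality scores. The function names `edc_curve` and `pauc_stepwise` are hypothetical, and the hard lower and soft upper error limits discussed in the paper are not modelled here.

```python
import numpy as np


def edc_curve(quality_a, quality_b, scores, threshold):
    """Return (discard_fractions, fnmr) for a set of mated comparisons.

    Assumptions (not taken from the paper's code): higher score = more
    similar, and a comparison is an error when score < threshold.
    """
    quality_a = np.asarray(quality_a, dtype=float)
    quality_b = np.asarray(quality_b, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Pairwise quality: the lowest quality score of the two samples.
    pair_quality = np.minimum(quality_a, quality_b)
    errors = scores < threshold

    # Sort comparisons from lowest to highest pairwise quality.
    order = np.argsort(pair_quality, kind="stable")
    sorted_quality = pair_quality[order]
    sorted_errors = errors[order]
    n = len(scores)

    # One curve point per distinct quality value: comparisons that share
    # a quality value are discarded together. first_idx[k] is the number
    # of comparisons with a quality strictly below the k-th unique value.
    _, first_idx = np.unique(sorted_quality, return_index=True)
    discard_counts = first_idx

    # Errors among the remaining comparisons at each discard step.
    cum_errors = np.concatenate([[0], np.cumsum(sorted_errors)])
    remaining_errors = cum_errors[-1] - cum_errors[discard_counts]
    remaining = n - discard_counts

    return discard_counts / n, remaining_errors / remaining


def pauc_stepwise(discard, error, limit):
    """Area under the stepwise EDC curve for discard fractions in [0, limit]."""
    # Each error value holds until the next discard fraction (or until 1.0).
    edges = np.clip(np.append(discard, 1.0), 0.0, limit)
    return float(np.sum(np.diff(edges) * error))


# Toy data: four mated comparisons, two of which are false non-matches.
discard, fnmr = edc_curve(
    quality_a=[10, 20, 30, 40],
    quality_b=[50, 50, 50, 50],
    scores=[0.1, 0.9, 0.2, 0.8],
    threshold=0.5,
)
pauc = pauc_stepwise(discard, fnmr, limit=0.5)
```

The area is computed with stepwise rather than linear interpolation because the error only changes when a further group of comparisons is actually discarded. A lower pAUC value for the same discard fraction limit and starting error indicates that the quality scores are better at flagging the comparisons that cause errors.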

URL

https://arxiv.org/abs/2303.13294

PDF

https://arxiv.org/pdf/2303.13294.pdf

