Paper Reading AI Learner

A Double-Deep Spatio-Angular Learning Framework for Light Field based Face Recognition

2019-04-24 23:12:01
Alireza Sepas-Moghaddam, Mohammad A. Haque, Paulo Lobato Correia, Kamal Nasrollahi, Thomas B. Moeslund, Fernando Pereira

Abstract

Face recognition has attracted increasing attention due to its wide range of applications, but it is still challenging when facing large variations in the biometric data characteristics. Lenslet light field cameras have recently come into prominence to capture rich spatio-angular information, thus offering new possibilities for advanced biometric recognition systems. This paper proposes a double-deep spatio-angular learning framework for light field based face recognition, which is able to learn both texture and angular dynamics in sequence using convolutional representations; this is a novel recognition framework that has never been proposed before for either face recognition or any other visual recognition task. The proposed double-deep learning framework includes a long short-term memory (LSTM) recurrent network whose inputs are VGG-Face descriptions that are computed using a VGG-Very-Deep-16 convolutional neural network (CNN). The VGG-16 network uses different face viewpoints rendered from a full light field image, which are organised as a pseudo-video sequence. A comprehensive set of experiments has been conducted with the IST-EURECOM light field face database, for varied and challenging recognition tasks. Results show that the proposed framework achieves superior face recognition performance when compared to the state-of-the-art.

Abstract (translated)

人脸识别因其广泛的应用而日益受到关注,但面对生物特征数据特征的巨大变化,人脸识别仍然具有挑战性。Lenslet光场摄像机最近已成为捕获丰富的空间角信息的重要设备,从而为先进的生物识别系统提供了新的可能性。本文提出了一种基于光场的人脸识别的双深空角学习框架,该框架能够用卷积表示法同时学习纹理和角度动力学,是一种新的人脸识别框架,在人脸识别和其他视觉识别领域都没有提出过。问。提出的双深度学习框架包括一个长短期记忆(lstm)循环网络,其输入是使用vgg-very-deep-16卷积神经网络(cnn)计算的vgg人脸描述。VGG-16网络使用从全光场图像渲染的不同的人脸视点,该图像被组织为一个伪视频序列。利用IST-Eurecom光场人脸数据库进行了一系列综合性实验,以完成各种各样且具有挑战性的识别任务。结果表明,与现有技术相比,该框架具有更好的人脸识别性能。

URL

https://arxiv.org/abs/1805.10078

PDF

https://arxiv.org/pdf/1805.10078.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot