Abstract
Deep networks have achieved remarkable success in application domains such as object and face recognition. The performance gains are attributed to different facets of the network architecture, such as the depth of the convolutional layers, the activation function, pooling, batch normalization, forward and back propagation, and many more. However, very little emphasis is placed on the preprocessors. Therefore, in this paper, the network's preprocessing module is varied across different preprocessing approaches, while the other facets of the network architecture are kept constant, to investigate the contribution preprocessing makes to the network. The commonly used preprocessors, data augmentation and normalization, are termed conventional preprocessors. The others are termed unconventional preprocessors: the color space converters HSV, CIE L*a*b* and YCbCr; the grey-level resolution preprocessors, full-based and plane-based image quantization; and illumination normalization and illumination-insensitive feature preprocessing using histogram equalization (HE), local contrast normalization (LN) and the complete face structural pattern (CFSP). To achieve fixed network parameters, CNNs with transfer learning are employed: knowledge from the high-level feature vectors of the Inception-V3 network is transferred to offline-preprocessed LFW target data, and the features are trained using the SoftMax classifier for face identification. The experiments show that the discriminative capability of deep networks can be improved by preprocessing RGB data with the HE, full-based and plane-based quantization, rgbGELog, and YCbCr preprocessors before feeding it to the CNNs. However, for best performance, the right combination of preprocessed data with augmentation and/or normalization is required. Plane-based image quantization is found to increase the homogeneity of neighborhood pixels and to use a reduced bit depth for better storage efficiency.
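Two of the preprocessors the abstract names can be sketched briefly. The following NumPy snippet is a rough illustration only (the paper's exact parameters, such as the quantization bit depth, are assumptions): histogram equalization spreads a channel's intensity histogram over the full 8-bit range, and plane-based grey-level quantization reduces the bit depth of a single image plane.

```python
import numpy as np

def histogram_equalize(channel):
    """Standard histogram equalization of an 8-bit channel via its CDF."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Map each grey level through the normalized CDF (classic HE formula).
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[channel]

def quantize_plane(channel, bits=4):
    """Uniformly re-quantize one image plane to `bits` bits per pixel.

    `bits=4` is an illustrative choice, not necessarily the paper's setting.
    """
    levels = 2 ** bits
    step = 256 // levels
    return (channel // step) * step
```

Applied per plane of an RGB (or colour-converted) image, these produce the kind of offline-preprocessed data that is then fed to the fixed CNN.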
URL
https://arxiv.org/abs/1904.00815