Abstract
Non-invasive gaze estimation methods usually regress gaze directions directly from a single face or eye image. However, because eye shape and inner eye structure vary significantly between individuals, universal models achieve limited accuracy, and their outputs usually exhibit high variance as well as subject-dependent biases. Accuracy is therefore usually increased through calibration, which allows the gaze predictions for a subject to be mapped to his/her actual gaze. In this paper, we introduce a novel image differential method for gaze estimation. We propose to directly train a differential convolutional neural network to predict the gaze difference between two eye input images of the same subject. Then, given a set of subject-specific calibration images, we can use the inferred differences to predict the gaze direction of a novel eye sample. The assumption is that comparing two eye images can greatly reduce the nuisance factors (alignment, eyelid closing, illumination perturbations) that usually plague single-image prediction methods, leading to better predictions overall. Experiments on three public datasets validate our approach, which consistently outperforms state-of-the-art methods, even when using only one calibration sample or when the latter methods are followed by subject-specific gaze adaptation.
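The inference step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: gaze is represented as a (yaw, pitch) pair in degrees, the differential model returns gaze(a) minus gaze(b), and the per-calibration estimates are combined by a plain average. The names `predict_gaze` and `toy_diff_model` are hypothetical, and the toy model is a stand-in for the trained differential network.

```python
from typing import Callable, Sequence, Tuple

# Assumption: gaze is a (yaw, pitch) pair in degrees.
Gaze = Tuple[float, float]


def predict_gaze(
    query_image,
    calibration: Sequence[Tuple[object, Gaze]],
    diff_model: Callable[[object, object], Gaze],
) -> Gaze:
    """Estimate the gaze of query_image from same-subject calibration samples.

    diff_model(a, b) is assumed to return gaze(a) - gaze(b).
    Each calibration sample yields one estimate; the estimates are averaged
    (the averaging rule is an assumption of this sketch).
    """
    if not calibration:
        raise ValueError("at least one calibration sample is required")
    estimates = []
    for cal_image, cal_gaze in calibration:
        d_yaw, d_pitch = diff_model(query_image, cal_image)
        estimates.append((cal_gaze[0] + d_yaw, cal_gaze[1] + d_pitch))
    n = len(estimates)
    return (sum(e[0] for e in estimates) / n, sum(e[1] for e in estimates) / n)


def toy_diff_model(a: Gaze, b: Gaze) -> Gaze:
    """Hypothetical stand-in for the trained network.

    Here an "image" is just its true gaze plus a subject-dependent bias,
    so subtracting two images of the same subject cancels the bias.
    """
    return (a[0] - b[0], a[1] - b[1])


# Toy data: one subject with a constant bias unknown to the estimator.
bias = (3.0, -2.0)
calibration = [
    ((0.0 + bias[0], 0.0 + bias[1]), (0.0, 0.0)),    # (image, known gaze)
    ((10.0 + bias[0], 5.0 + bias[1]), (10.0, 5.0)),
]
query = (4.0 + bias[0], 1.0 + bias[1])  # true gaze is (4.0, 1.0)

estimate = predict_gaze(query, calibration, toy_diff_model)
print(estimate)
```

In this toy setting the subject-dependent bias cancels in every pairwise difference, which mirrors the motivation given in the abstract; a real differential network would only approximate that cancellation.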
URL
https://arxiv.org/abs/1904.09459