Abstract
Gaze estimation has attracted growing research interest. Most current methods take a single-view facial image as input, but they struggle with large head angles, which leads to inaccurate estimates. Adding a second camera view can capture eye appearance better. However, existing multi-view methods have two limitations: 1) they require multi-view annotations for training, which are expensive; and 2) more importantly, at test time the exact positions of the cameras must be known and must match those used in training, which limits the application scenarios. To address these challenges, we propose a novel 1-view-to-2-views (1-to-2 views) adaptation solution, the Unsupervised 1-to-2 Views Adaptation framework for Gaze estimation (UVAGaze). Our method adapts a traditional single-view gaze estimator to flexibly placed dual cameras. Here, "flexibly" means that the dual cameras can be placed at arbitrary positions, regardless of the training data, without knowing their extrinsic parameters. Specifically, UVAGaze builds a dual-view mutual supervision adaptation strategy that exploits the intrinsic consistency of gaze directions between the two views. In this way, our method not only benefits from common single-view pre-training but also achieves more advanced dual-view gaze estimation. Experimental results show that a single-view estimator, once adapted for dual views, achieves much higher accuracy, especially in cross-dataset settings, with a substantial improvement of 47.0%. Project page: this https URL.
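The consistency that the abstract relies on can be illustrated geometrically: the same physical gaze direction, expressed in two camera coordinate systems, differs only by the relative rotation between the cameras. The abstract does not say how UVAGaze enforces this, so the following is only a minimal sketch under our own assumptions, not the authors' method. It assumes paired unit gaze vectors from both views, estimates the unknown rotation with the Kabsch (orthogonal Procrustes) solution, and reports the remaining angular disagreement. The names `estimate_rotation`, `angular_error_deg`, and `consistency_error_deg` are hypothetical.

```python
import numpy as np


def estimate_rotation(g1, g2):
    """Least-squares rotation R such that R @ g1[i] ~= g2[i] (Kabsch).

    g1, g2: (N, 3) arrays of gaze vectors predicted in view 1 and view 2.
    This is an illustrative assumption, not the procedure used by UVAGaze.
    """
    h = g1.T @ g2                      # 3x3 cross-covariance of the pairs
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def angular_error_deg(a, b):
    """Per-row angle in degrees between two (N, 3) arrays of vectors."""
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))


def consistency_error_deg(g1, g2):
    """Angular disagreement between views after aligning view 1 to view 2."""
    r = estimate_rotation(g1, g2)
    return angular_error_deg(g1 @ r.T, g2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for per-frame gaze predictions in view 1.
    view1 = rng.normal(size=(100, 3))
    view1 /= np.linalg.norm(view1, axis=1, keepdims=True)
    # A made-up relative rotation (30 degrees about the vertical axis).
    t = np.radians(30.0)
    r_true = np.array([[np.cos(t), 0.0, np.sin(t)],
                       [0.0, 1.0, 0.0],
                       [-np.sin(t), 0.0, np.cos(t)]])
    # View-2 predictions: rotated view-1 gaze plus small prediction noise.
    view2 = view1 @ r_true.T + 0.02 * rng.normal(size=(100, 3))
    print("mean disagreement (deg):", consistency_error_deg(view1, view2).mean())
```

In a sketch like this, a small residual means the two single-view predictions agree up to a rigid rotation, which is the kind of signal that could supervise adaptation without gaze labels or calibrated extrinsics.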
URL
https://arxiv.org/abs/2312.15644