Paper Reading AI Learner

Semi-Synthetic Dataset Augmentation for Application-Specific Gaze Estimation

2023-10-27 20:27:22
Cedric Leblond-Menard, Gabriel Picard-Krashevski, Sofiane Achiche


Although the number of gaze estimation datasets is growing, the application of appearance-based gaze estimation methods is mostly limited to estimating the point of gaze on a screen. This is in part because most datasets are generated in a similar fashion, where the gaze target is on a screen close to camera's origin. In other applications such as assistive robotics or marketing research, the 3D point of gaze might not be close to the camera's origin, meaning models trained on current datasets do not generalize well to these tasks. We therefore suggest generating a textured tridimensional mesh of the face and rendering the training images from a virtual camera at a specific position and orientation related to the application as a mean of augmenting the existing datasets. In our tests, this lead to an average 47% decrease in gaze estimation angular error.

Abstract (translated)

尽管 gaze estimation 数据集的数量正在增加,但基于外观的 gaze 估计方法的应用主要限于在屏幕上估计眼神。这部分是因为大多数数据集是在类似于相机起源的位置和方向上生成的。在辅助机器人学或市场研究等应用中,眼神的 3D 点可能并不接近相机起源,这意味着用当前数据集训练的模型在这些任务上表现不好。因此,我们建议生成一个纹理化的三维人脸网格,并从与应用程序相关的虚拟摄像机在特定位置和方向上渲染训练图像,作为增加现有数据集的 means。在我们的测试中,这导致 gaze 估计角误差平均降低了 47%。



3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot