Paper Reading AI Learner

Cross-Dataset Gaze Estimation by Evidential Inter-intra Fusion

2024-09-07 08:53:17
Shijing Wang, Yaping Huang, Jun Xie, Yi Tian, Feng Chen, Zhepeng Wang

Abstract

Achieving accurate and reliable gaze predictions in complex and diverse environments remains challenging. Fortunately, diverse gaze datasets are readily accessible in real-world applications. We discover that training on these datasets jointly can significantly improve the generalization of gaze estimation, which has been overlooked in previous works. However, due to the inherent distribution shift across datasets, simply mixing multiple datasets degrades performance in the original domains even as it improves generalization. To address this problem of "cross-dataset gaze estimation", we propose a novel Evidential Inter-intra Fusion (EIF) framework for training a cross-dataset model that performs well across all source and unseen domains. Specifically, we build an independent single-dataset branch for each dataset, in which the data space is partitioned into overlapping subspaces for local regression, and we further create a cross-dataset branch to integrate the generalizable features from the single-dataset branches. Furthermore, evidential regressors based on the Normal Inverse-Gamma (NIG) distribution are designed to provide uncertainty estimates in addition to gaze predictions. Building upon this foundation, our proposed framework achieves both intra-evidential fusion among the multiple local regressors within each dataset and inter-evidential fusion among the multiple branches via a Mixture of Normal Inverse-Gamma (MoNIG) distribution. Experiments demonstrate that our method consistently achieves notable improvements in both source domains and unseen domains.
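
To make the fusion step more concrete, the following is a minimal sketch of how two NIG-parameterized predictions could be merged. It assumes the NIG summation rule commonly used in the MoNIG literature; the abstract does not spell out the exact operator, so the formula, the class name `NIG`, the field names, and the function `fuse` are all illustrative assumptions and not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Parameters of a Normal Inverse-Gamma distribution (hypothetical naming)."""
    mu: float     # predicted value, e.g. one gaze angle
    nu: float     # confidence weight on the mean, must be > 0
    alpha: float  # inverse-gamma shape, must be > 1 for finite variance
    beta: float   # inverse-gamma scale, must be > 0

    def aleatoric(self) -> float:
        # Expected data noise E[sigma^2] = beta / (alpha - 1).
        return self.beta / (self.alpha - 1.0)

    def epistemic(self) -> float:
        # Variance of the mean Var[mu] = beta / (nu * (alpha - 1)).
        return self.beta / (self.nu * (self.alpha - 1.0))


def fuse(a: NIG, b: NIG) -> NIG:
    """Combine two NIG predictions (assumed MoNIG-style summation rule)."""
    nu = a.nu + b.nu
    # The fused mean is a confidence-weighted average of the two means.
    mu = (a.nu * a.mu + b.nu * b.mu) / nu
    alpha = a.alpha + b.alpha + 0.5
    # Disagreement between the two means inflates the fused scale.
    beta = (a.beta + b.beta
            + 0.5 * a.nu * (a.mu - mu) ** 2
            + 0.5 * b.nu * (b.mu - mu) ** 2)
    return NIG(mu, nu, alpha, beta)


# Two hypothetical regressors that disagree; the second is more confident.
local_a = NIG(mu=0.0, nu=1.0, alpha=2.0, beta=1.0)
local_b = NIG(mu=1.0, nu=3.0, alpha=2.0, beta=1.0)
fused = fuse(local_a, local_b)
```

Under this assumed rule the fused mean is pulled toward the more confident regressor, and the same operator could in principle be applied first across local regressors within a branch and then across branches.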

URL

https://arxiv.org/abs/2409.04766

PDF

https://arxiv.org/pdf/2409.04766.pdf

