Expression-aware video inpainting for HMD removal in XR applications

2024-01-25 12:32:21
Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr

Abstract

Head-mounted displays (HMDs) serve as indispensable devices for observing extended reality (XR) environments and virtual content. However, HMDs present an obstacle to external recording techniques as they block the upper face of the user. This limitation significantly affects social XR applications, specifically teleconferencing, where facial features and eye gaze information play a vital role in creating an immersive user experience. In this study, we propose a new network for expression-aware video inpainting for HMD removal (EVI-HRnet) based on generative adversarial networks (GANs). Our model effectively fills in missing information with regard to facial landmarks and a single occlusion-free reference image of the user. The framework and its components ensure the preservation of the user's identity across frames using the reference frame. To further improve the level of realism of the inpainted output, we introduce a novel facial expression recognition (FER) loss function for emotion preservation. Our results demonstrate the remarkable capability of the proposed framework to remove HMDs from facial videos while maintaining the subject's facial expression and identity. Moreover, the outputs exhibit temporal consistency along the inpainted frames. This lightweight framework presents a practical approach for HMD occlusion removal, with the potential to enhance various collaborative XR applications without the need for additional hardware.
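The abstract does not state how the FER loss is computed, so the following is only a minimal sketch of one plausible form, not the authors' formulation. It assumes a frozen, pretrained expression classifier (hypothetical, not shown here) that returns per-frame logits over emotion classes, and it penalizes the KL divergence between the emotion distribution of the ground-truth frame and that of the inpainted frame. The function names `softmax`, `fer_loss`, and `video_fer_loss` are illustrative.

```python
import math


def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fer_loss(logits_real, logits_inpainted, eps=1e-12):
    """KL(p_real || p_inpainted) over emotion classes for a single frame.

    Assumption: both logit vectors come from the same frozen FER classifier,
    applied to the occlusion-free frame and to the inpainted frame.
    """
    p = softmax(logits_real)
    q = softmax(logits_inpainted)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def video_fer_loss(real_frames_logits, inpainted_frames_logits):
    """Average the per-frame FER loss over a clip."""
    losses = [
        fer_loss(r, g)
        for r, g in zip(real_frames_logits, inpainted_frames_logits)
    ]
    return sum(losses) / len(losses)
```

In a GAN training loop, a term like this would typically be added to the generator objective with a weighting factor alongside the reconstruction and adversarial terms; the weighting and the choice of divergence are assumptions here and would need to be taken from the paper itself.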

URL

https://arxiv.org/abs/2401.14136

PDF

https://arxiv.org/pdf/2401.14136.pdf

