ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images

2024-04-25 17:59:46
Weiqi Li, Shijie Zhao, Bin Chen, Xinhua Cheng, Junlin Li, Li Zhang, Jian Zhang

Abstract

With the advent of virtual reality technology, omnidirectional image (ODI) rescaling techniques are increasingly used to reduce transmitted and stored file sizes while preserving high image quality. However, current ODI rescaling methods predominantly focus on enhancing image quality in the equirectangular projection (ERP) format, overlooking the fact that the content viewed on head-mounted displays (HMDs) is a rendered viewport rather than an ERP image. In this work, we emphasize that focusing solely on ERP quality results in inferior viewport visual experiences for users. We therefore propose ResVR, the first comprehensive framework for the joint Rescaling and Viewport Rendering of ODIs. ResVR produces low-resolution (LR) ERP images for transmission while rendering high-quality viewports for users to watch on HMDs. In ResVR, a novel discrete pixel sampling strategy tackles the complex mapping between the viewport and ERP, enabling end-to-end training of the ResVR pipeline. Furthermore, a spherical pixel shape representation technique is derived from spherical differentiation to significantly improve the visual quality of rendered viewports. Extensive experiments demonstrate that ResVR outperforms existing methods in viewport rendering tasks across different fields of view, resolutions, and view directions while keeping a low transmission overhead.
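To make the viewport-to-ERP mapping mentioned in the abstract more concrete, here is a minimal sketch of the standard perspective (gnomonic) mapping from a viewport pixel to ERP coordinates. It is not the discrete pixel sampling strategy proposed in ResVR, which the abstract does not detail. The function name `viewport_pixel_to_erp`, the axis conventions (x right, y down, z forward), the sign conventions for yaw and pitch, and the pixel-center offset are all assumptions made for illustration.

```python
import math


def viewport_pixel_to_erp(u, v, vp_w, vp_h, fov_deg, yaw_deg, pitch_deg, erp_w, erp_h):
    """Map viewport pixel (u, v) to continuous ERP coordinates (x, y).

    Hypothetical helper, assuming: horizontal FOV in degrees, positive yaw
    turns the view to the right, positive pitch turns it upward, and the
    ERP image spans longitude [-180, 180) and latitude [90, -90] top to bottom.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = (vp_w / 2) / math.tan(math.radians(fov_deg) / 2)

    # Ray through the pixel center in camera coordinates (x right, y down, z forward).
    cx = u + 0.5 - vp_w / 2
    cy = v + 0.5 - vp_h / 2
    cz = f

    pitch = math.radians(pitch_deg)
    yaw = math.radians(yaw_deg)

    # Rotate about the x-axis (pitch), then about the y-axis (yaw).
    y1 = cy * math.cos(pitch) - cz * math.sin(pitch)
    z1 = cy * math.sin(pitch) + cz * math.cos(pitch)
    x2 = cx * math.cos(yaw) + z1 * math.sin(yaw)
    z2 = -cx * math.sin(yaw) + z1 * math.cos(yaw)

    # Convert the rotated ray to longitude and latitude on the sphere.
    lon = math.atan2(x2, z2)
    lat = math.atan2(-y1, math.hypot(x2, z2))

    # Convert spherical angles to ERP pixel coordinates (x wraps horizontally).
    x = ((lon / (2 * math.pi)) + 0.5) * erp_w % erp_w
    y = (0.5 - lat / math.pi) * erp_h
    return x, y
```

The returned coordinates are generally fractional, so a renderer has to resample the ERP image at non-integer positions, and the sampling pattern changes with the view direction and field of view.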

URL

https://arxiv.org/abs/2404.16825

PDF

https://arxiv.org/pdf/2404.16825.pdf

