Paper Reading AI Learner

Anisotropic Neural Representation Learning for High-Quality Neural Rendering

2023-11-30 07:29:30
Y. Wang, J. Xu, Y. Zeng, Y. Gong

Abstract

Neural radiance fields (NeRFs) have achieved impressive view synthesis results by learning an implicit volumetric representation from multi-view images. To project the implicit representation into an image, NeRF employs volume rendering that approximates the continuous integrals of rays as an accumulation of the colors and densities of the sampled points. Although this approximation enables efficient rendering, it ignores the direction information in point intervals, resulting in ambiguous features and limited reconstruction quality. In this paper, we propose an anisotropic neural representation learning method that utilizes learnable view-dependent features to improve scene representation and reconstruction. We model the volumetric function as spherical harmonic (SH)-guided anisotropic features, parameterized by multilayer perceptrons, facilitating ambiguity elimination while preserving the rendering efficiency. To achieve robust scene reconstruction without anisotropy overfitting, we regularize the energy of the anisotropic features during training. Our method is flexible and can be plugged into NeRF-based frameworks. Extensive experiments show that the proposed representation can boost the rendering quality of various NeRFs and achieve state-of-the-art rendering performance on both synthetic and real-world scenes.
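The discrete volume-rendering quadrature the abstract refers to (accumulating sampled colors and densities along a ray) can be sketched as follows. This is a minimal NumPy illustration of the standard NeRF compositing formula, not the paper's anisotropic method; the function name and toy inputs are hypothetical.

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Approximate the continuous ray integral as a discrete
    accumulation over N sampled points (standard NeRF quadrature).

    sigmas: (N,)   densities at the samples along the ray
    colors: (N, 3) RGB colors at the samples
    deltas: (N,)   distances between adjacent samples
    """
    # Opacity contributed by each sample interval.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: fraction of the ray surviving up to each sample.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    # Per-sample compositing weights, then the accumulated color.
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

# Toy ray: the dense middle sample (red) occludes everything behind it.
rgb = volume_render(
    sigmas=np.array([0.0, 50.0, 0.0]),
    colors=np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=float),
    deltas=np.array([0.1, 0.1, 0.1]),
)
```

Note that the weights depend only on scalar densities and interval lengths, which is exactly why this approximation discards direction information inside each interval; the paper's anisotropic features address that limitation.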

URL

https://arxiv.org/abs/2311.18311

PDF

https://arxiv.org/pdf/2311.18311.pdf
