Quantum Implicit Neural Representations for 3D Scene Reconstruction and Novel View Synthesis

2025-12-14 13:24:11
Yeray Cordero, Paula García-Molina, Fernando Vilariño

Abstract

Implicit neural representations (INRs) have become a powerful paradigm for continuous signal modeling and 3D scene reconstruction, yet classical networks suffer from a well-known spectral bias that limits their ability to capture high-frequency details. Quantum Implicit Representation Networks (QIREN) mitigate this limitation by employing parameterized quantum circuits with inherent Fourier structures, enabling compact and expressive frequency modeling beyond classical MLPs. In this paper, we present Quantum Neural Radiance Fields (Q-NeRF), the first hybrid quantum-classical framework for neural radiance field rendering. Q-NeRF integrates QIREN modules into the Nerfacto backbone, preserving its efficient sampling, pose refinement, and volumetric rendering strategies while replacing selected density and radiance prediction components with quantum-enhanced counterparts. We systematically evaluate three hybrid configurations on standard multi-view indoor datasets, comparing them to classical baselines using PSNR, SSIM, and LPIPS metrics. Results show that hybrid quantum-classical models achieve competitive reconstruction quality under limited computational resources, with quantum modules particularly effective in representing fine-scale, view-dependent appearance. Although current implementations rely on quantum circuit simulators constrained to few-qubit regimes, the results highlight the potential of quantum encodings to alleviate spectral bias in implicit representations. Q-NeRF provides a foundational step toward scalable quantum-enabled 3D scene reconstruction and a baseline for future quantum neural rendering research.
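
To make the phrase "inherent Fourier structures" concrete, here is a minimal sketch that simulates a single-qubit circuit by hand. It is a toy illustration, not the QIREN or Q-NeRF architecture: the gate layout (a trainable rotation, one RX(x) data-encoding gate, another trainable rotation, then a Pauli-Z measurement) and all function names and parameter values are assumptions chosen for brevity. With one RX(x) encoding gate, the measured output is a degree-1 trigonometric polynomial of the form c + a·cos(x) + b·sin(x), with the coefficients set by the trainable angles.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rz(theta):
    """Rotation about Z: exp(-i * theta * Z / 2)."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


def ry(theta):
    """Rotation about Y: exp(-i * theta * Y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rx(theta):
    """Rotation about X: exp(-i * theta * X / 2). Used here to encode the input."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def trainable(params):
    """General single-qubit rotation RZ(c) RY(b) RZ(a) (hypothetical trainable block)."""
    a, b, c = params
    return matmul(rz(c), matmul(ry(b), rz(a)))


def circuit_output(x, params_in, params_out):
    """Return <Z> for the state W_out * RX(x) * W_in |0>."""
    u = matmul(trainable(params_out), matmul(rx(x), trainable(params_in)))
    amp0, amp1 = u[0][0], u[1][0]  # first column of u is the state prepared from |0>
    return abs(amp0) ** 2 - abs(amp1) ** 2


if __name__ == "__main__":
    p_in, p_out = (0.3, 1.1, -0.7), (0.9, -0.4, 0.2)  # arbitrary illustrative angles
    for x in (0.0, 0.5, 1.0, 1.5):
        print(f"x={x:.1f}  f(x)={circuit_output(x, p_in, p_out):+.4f}")
```

Repeating the encoding gate, or adding qubits, enlarges the set of frequencies the circuit can express. That is the general mechanism behind the abstract's claim of compact frequency modeling, although the specific circuits used in the paper are not reproduced here.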

URL

https://arxiv.org/abs/2512.12683

PDF

https://arxiv.org/pdf/2512.12683.pdf

