Abstract
Existing neural radiance field (NeRF)-based novel view synthesis methods for large-scale outdoor scenes are mainly built for a single altitude. Moreover, they often require the camera altitude and scene scope to be known a priori, which makes them inefficient and impractical when the camera altitude changes. In this work, we propose an end-to-end framework, termed AG-NeRF, that seeks to reduce the training cost of building good reconstructions by synthesizing free-viewpoint images across varying scene altitudes. Specifically, to tackle the detail-variation problem from low altitude (drone level) to high altitude (satellite level), we develop a source image selection method and an attention-based feature fusion approach that extract and fuse the features most relevant to the target view from multi-height images for high-fidelity rendering. Extensive experiments demonstrate that AG-NeRF achieves state-of-the-art performance on the 56 Leonard and Transamerica benchmarks and needs only half an hour of training to reach a PSNR competitive with the latest BungeeNeRF.
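The abstract mentions fusing target-view-relevant features from multiple source images via attention. A minimal sketch of what such a cross-view attention fusion could look like is shown below; the module name, shapes, and single-head design are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of attention-based feature fusion across source views.
# All names and shapes are illustrative assumptions, not AG-NeRF's real design.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Fuse per-source-view features for a target sample via dot-product attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)   # query projection from the target sample
        self.k = nn.Linear(dim, dim)   # key projection of source-view features
        self.v = nn.Linear(dim, dim)   # value projection of source-view features
        self.scale = dim ** -0.5

    def forward(self, target_feat: torch.Tensor, source_feats: torch.Tensor) -> torch.Tensor:
        # target_feat: (B, dim); source_feats: (B, N_views, dim)
        q = self.q(target_feat).unsqueeze(1)             # (B, 1, dim)
        k, v = self.k(source_feats), self.v(source_feats)
        attn = torch.softmax((q @ k.transpose(1, 2)) * self.scale, dim=-1)  # (B, 1, N_views)
        return (attn @ v).squeeze(1)                     # (B, dim), weighted mix of views

fusion = CrossViewFusion(dim=32)
out = fusion(torch.randn(4, 32), torch.randn(4, 3, 32))
print(out.shape)  # torch.Size([4, 32])
```

In this kind of design, views whose features best match the target query (e.g. source images taken at a similar altitude) receive larger attention weights, which is one plausible way to handle the low-to-high-altitude detail variation the abstract describes.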
URL
https://arxiv.org/abs/2404.11897