Abstract
In this paper, we propose a method to segment and recover a static, clean background and multiple 360$^\circ$ objects from observations of scenes at different timestamps. Recent works have used neural radiance fields to model 3D scenes and improve the quality of novel view synthesis, but few studies have focused on modeling the invisible or occluded parts of the training images. These under-reconstructed parts constrain both scene editing and rendering view selection, thereby limiting their utility for synthetic data generation for downstream tasks. Our basic idea is that, by observing the same set of objects in various arrangements, parts that are invisible in one scene may become visible in others. By fusing the visible parts from each scene, occlusion-free rendering of both background and foreground objects can be achieved. We decompose the multi-scene fusion task into two main components: (1) object/background segmentation and alignment, where we leverage point cloud-based methods tailored to our novel problem formulation; and (2) radiance field fusion, where we introduce a visibility field to quantify the visible information of radiance fields, and propose visibility-aware rendering for the fusion of a series of scenes, ultimately obtaining clean background and 360$^\circ$ object renderings. Comprehensive experiments were conducted on synthetic and real datasets, and the results demonstrate the effectiveness of our method.
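The visibility-aware fusion described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, array shapes, and the specific fusion rule (trusting, at each ray sample, the scene whose visibility field reports the highest confidence, then volume-rendering the fused samples) are illustrative assumptions.

```python
import numpy as np

def fuse_along_ray(sigma, rgb, vis):
    """Hypothetical visibility-aware fusion of per-scene radiance samples.

    sigma: (S, N) densities, rgb: (S, N, 3) colors, vis: (S, N) visibility
    scores for S aligned scenes and N samples along one ray.
    Returns fused densities (N,) and colors (N, 3).
    """
    best = np.argmax(vis, axis=0)          # per-sample best-observed scene
    idx = np.arange(sigma.shape[1])
    return sigma[best, idx], rgb[best, idx]

def composite(sigma, rgb, delta):
    """Standard volume rendering of the fused samples (delta: (N,) step sizes)."""
    alpha = 1.0 - np.exp(-sigma * delta)   # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)  # rendered pixel color
```

Under this rule, a background region occluded by an object in one arrangement is filled in from another arrangement where its visibility score is higher, which is the intuition behind the occlusion-free rendering claimed in the abstract.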
URL
https://arxiv.org/abs/2404.09426