ViFu: Multiple 360$^\circ$ Objects Reconstruction with Clean Background via Visible Part Fusion

2024-04-15 02:44:23
Tianhan Xu, Takuya Ikeda, Koichi Nishiwaki

Abstract

In this paper, we propose a method to segment and recover a static, clean background and multiple 360$^\circ$ objects from observations of scenes at different timestamps. Recent works have used neural radiance fields to model 3D scenes and improved the quality of novel view synthesis, while few studies have focused on modeling the invisible or occluded parts of the training images. These under-reconstructed parts constrain both scene editing and rendering view selection, thereby limiting their utility for synthetic data generation for downstream tasks. Our basic idea is that, by observing the same set of objects in various arrangements, parts that are invisible in one scene may become visible in others. By fusing the visible parts from each scene, occlusion-free rendering of both background and foreground objects can be achieved. We decompose the multi-scene fusion task into two main components: (1) object/background segmentation and alignment, where we leverage point cloud-based methods tailored to our novel problem formulation; (2) radiance field fusion, where we introduce a visibility field to quantify the visible information of radiance fields, and propose visibility-aware rendering for the fusion of a series of scenes, ultimately obtaining a clean background and 360$^\circ$ object renderings. Comprehensive experiments were conducted on synthetic and real datasets, and the results demonstrate the effectiveness of our method.
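The visibility-aware fusion idea lends itself to a short illustration. Below is a minimal sketch, not the authors' implementation: it assumes each aligned scene exposes hypothetical `sigma`, `color`, and `visibility` callables (the last approximating the paper's visibility field as a per-point score in [0, 1]). At each ray sample it takes density and color from the scene that observed that point best, then volume-renders the fused samples.

```python
# Minimal sketch of visibility-aware fusion of aligned radiance fields.
# All scene callables here are hypothetical stand-ins, not the paper's API.
import numpy as np

def composite(sigmas, colors, deltas):
    """Standard volume rendering (alpha compositing) along one ray."""
    alphas = 1.0 - np.exp(-sigmas * deltas)            # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas                           # contribution per sample
    return (weights[:, None] * colors).sum(axis=0)     # fused RGB

def fused_render(ray_pts, deltas, scenes):
    """At each sample point, take density/color from the scene whose
    visibility field scores that point highest, then composite."""
    vis = np.stack([s["visibility"](ray_pts) for s in scenes])  # (K, N)
    best = vis.argmax(axis=0)                                   # (N,)
    sigmas = np.empty(len(ray_pts))
    colors = np.empty((len(ray_pts), 3))
    for i, s in enumerate(scenes):
        mask = best == i
        if mask.any():
            sigmas[mask] = s["sigma"](ray_pts[mask])
            colors[mask] = s["color"](ray_pts[mask])
    return composite(sigmas, colors, deltas)
```

Hard argmax selection is only one plausible fusion rule; a visibility-weighted average of the per-scene densities and colors would be a natural soft alternative.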

URL

https://arxiv.org/abs/2404.09426

PDF

https://arxiv.org/pdf/2404.09426.pdf