Abstract
We introduce Uni-Fusion, a universal continuous mapping framework for surfaces, surface properties (color, infrared, etc.) and more (latent features in CLIP embedding space, etc.). We propose the first Universal Implicit Encoding model that supports encoding of both geometry and various types of properties (RGB, infrared, features, etc.) without the need for any training. Based on this model, our framework divides the point cloud into regular grid voxels and produces a latent feature in each voxel to form a Latent Implicit Map (LIM) for geometry and arbitrary properties. Then, by fusing the Local LIM of each new frame into the Global LIM, it achieves incremental reconstruction. Encoded with the corresponding types of data, our Latent Implicit Map is capable of generating continuous surfaces, surface property fields, surface feature fields and any other possible options. To demonstrate the capabilities of our model, we implement three applications: (1) incremental reconstruction of surfaces and color, (2) 2D-to-3D transfer of fabricated properties, and (3) open-vocabulary scene understanding by producing a text CLIP feature field on surfaces. We evaluate Uni-Fusion against existing methods in each of these applications; it shows high flexibility across applications while performing best or competitively. The project page of Uni-Fusion is available at this https URL
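The voxel partition and the local-to-global fusion described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not specify the encoder or the fusion rule, so a per-voxel mean of property values stands in for the training-free Universal Implicit Encoding, and a point-count-weighted average stands in for the LIM fusion step. The names `voxel_key`, `encode_local_map`, and `fuse` are hypothetical and do not come from the Uni-Fusion code.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]
# Each entry stores (latent vector, weight); here the weight is a point count.
LatentMap = Dict[VoxelKey, Tuple[List[float], float]]


def voxel_key(p: Point, voxel_size: float) -> VoxelKey:
    """Return the index of the regular-grid voxel that contains point p."""
    return (int(p[0] // voxel_size), int(p[1] // voxel_size), int(p[2] // voxel_size))


def encode_local_map(points: Sequence[Point],
                     values: Sequence[Sequence[float]],
                     voxel_size: float) -> LatentMap:
    """Group one frame's points by voxel and encode each group.

    Stand-in encoder (assumption): the latent vector of a voxel is the mean
    of the property vectors of the points that fall inside it.
    """
    sums: Dict[VoxelKey, List[float]] = {}
    counts: Dict[VoxelKey, int] = {}
    for p, v in zip(points, values):
        k = voxel_key(p, voxel_size)
        if k not in sums:
            sums[k] = [0.0] * len(v)
            counts[k] = 0
        sums[k] = [s + x for s, x in zip(sums[k], v)]
        counts[k] += 1
    return {k: ([s / counts[k] for s in sums[k]], float(counts[k])) for k in sums}


def fuse(global_map: LatentMap, local_map: LatentMap) -> None:
    """Fuse a local map into the global map in place.

    Assumed fusion rule: weighted average of latent vectors per voxel,
    with weights accumulated across frames.
    """
    for k, (feat, w) in local_map.items():
        if k in global_map:
            g_feat, g_w = global_map[k]
            total = g_w + w
            fused = [(g_w * a + w * b) / total for a, b in zip(g_feat, feat)]
            global_map[k] = (fused, total)
        else:
            global_map[k] = (list(feat), w)


# Two frames with a single scalar property per point (e.g. intensity).
global_lim: LatentMap = {}
frame1 = encode_local_map([(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.1, 0.1, 0.1)],
                          [[1.0], [3.0], [5.0]], voxel_size=1.0)
frame2 = encode_local_map([(0.3, 0.3, 0.3)], [[8.0]], voxel_size=1.0)
fuse(global_lim, frame1)
fuse(global_lim, frame2)
```

Because each voxel keeps only a latent vector and a weight, a new frame touches only the voxels it observes, which is what makes the reconstruction incremental in this sketch. Decoding the latents back into continuous surface or property fields is outside its scope.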
Abstract (translated)
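We introduce Uni-Fusion, a universal continuous mapping framework for surfaces, surface properties (color, infrared, etc.) and more. We propose the first universal implicit encoding model that supports encoding of both geometry and arbitrary properties (such as RGB, infrared, and features) without any training. Based on this model, we partition the point cloud into regular grid voxels and produce a latent feature in each voxel to form a Latent Implicit Map (LIM) for geometry and arbitrary properties. Then, by fusing the local LIM of a new frame into the global LIM, incremental reconstruction is achieved. Encoded with the corresponding data, our latent implicit map can generate continuous surfaces, surface property fields, surface feature fields and any other possible options. To demonstrate the capabilities of our model, we implement three applications: (1) incremental reconstruction of surfaces and color; (2) 2D-to-3D transfer of fabricated properties; (3) open-vocabulary scene understanding by generating a text CLIP feature field on surfaces. We compare Uni-Fusion on the corresponding applications, where it shows high flexibility while performing best or competitively. The Uni-Fusion project page is available at this https URL.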
URL
https://arxiv.org/abs/2303.12678