Abstract
Recently, with the development of Neural Radiance Fields and Gaussian Splatting, 3D reconstruction techniques have achieved remarkably high fidelity. However, the latent representations learned by these methods are highly entangled and lack interpretability. In this paper, we propose a novel part-aware compositional reconstruction method, called GaussianBlock, that enables semantically coherent and disentangled representations, allowing for precise and physical editing akin to building blocks, while simultaneously maintaining high fidelity. Our GaussianBlock introduces a hybrid representation that leverages the advantages of both primitives, known for their flexible actionability and editability, and 3D Gaussians, which excel in reconstruction quality. Specifically, we achieve semantically coherent primitives through a novel attention-guided centering loss derived from 2D semantic priors, complemented by a dynamic splitting and fusion strategy. Furthermore, we utilize 3D Gaussians that hybridize with primitives to refine structural details and enhance fidelity. Additionally, a binding inheritance strategy is employed to strengthen and maintain the connection between the two. Our reconstructed scenes are shown to be disentangled, compositional, and compact across diverse benchmarks, enabling seamless, direct, and precise editing while maintaining high quality.
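The abstract does not specify the form of the attention-guided centering loss. As a rough illustration only, one plausible reading is a loss that pulls each primitive's projected center toward the attention-weighted centroid of the 2D points assigned to its semantic part; the function and parameter names below (`centering_loss`, `attn`, `part_ids`) are hypothetical, not from the paper.

```python
import numpy as np

def centering_loss(primitive_centers, points_2d, attn, part_ids):
    """Toy sketch of an attention-guided centering loss.

    primitive_centers : (K, 2) projected 2D centers of the K primitives
    points_2d         : (N, 2) projected 2D points
    attn              : (N,)   attention weights from a 2D semantic prior
    part_ids          : (N,)   semantic part index (0..K-1) of each point
    """
    loss = 0.0
    for k, center in enumerate(primitive_centers):
        mask = part_ids == k
        # Attention-weighted centroid of this part's 2D points.
        w = attn[mask] / attn[mask].sum()
        centroid = (w[:, None] * points_2d[mask]).sum(axis=0)
        # Penalize the primitive center's distance from that centroid.
        loss += np.sum((center - centroid) ** 2)
    return loss / len(primitive_centers)
```

In this sketch the loss vanishes exactly when every primitive center coincides with its part's attention-weighted centroid, which is one way a 2D semantic prior could steer primitives toward semantically coherent placements.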
Abstract (translated)
In recent years, with the development of Neural Radiance Fields and Gaussian Splatting, 3D reconstruction techniques have achieved remarkably high fidelity. However, the latent representations learned by these methods are highly entangled and lack interpretability. In this paper, we propose a novel part-aware compositional reconstruction method, called GaussianBlock, which enables semantically coherent and disentangled representations, allowing precise, physical editing akin to building blocks while maintaining high fidelity. GaussianBlock introduces a hybrid representation that combines primitives, valued for their flexible actionability and editability, with 3D Gaussians, which excel in reconstruction quality. Semantically coherent primitives are obtained through a novel attention-guided centering loss derived from 2D semantic priors, complemented by a dynamic splitting and fusion strategy. In addition, 3D Gaussians hybridized with the primitives refine structural details and enhance fidelity, and a binding inheritance strategy strengthens and maintains the connection between the two. Validated across diverse benchmarks, our reconstructed scenes are shown to be disentangled, compositional, and compact, enabling seamless, direct, and precise editing without sacrificing quality.
URL
https://arxiv.org/abs/2410.01535