Paper Reading AI Learner

External Knowledge Enhanced 3D Scene Generation from Sketch

2024-03-21 04:24:49
Zijie Wu, Mingtao Feng, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, Ajmal Mian

Abstract

Generating realistic 3D scenes is challenging due to the complexity of room layouts and object geometries. We propose a sketch-based knowledge-enhanced diffusion architecture (SEK) for generating customized, diverse, and plausible 3D scenes. SEK conditions the denoising process on a hand-drawn sketch of the target scene and cues from an object-relationship knowledge base. We first construct an external knowledge base containing object relationships and then leverage knowledge-enhanced graph reasoning to help our model understand hand-drawn sketches. A scene is represented as a combination of 3D objects and their relationships, and is then incrementally diffused until it reaches a Gaussian distribution. We propose a 3D denoising scene transformer that learns to reverse the diffusion process, conditioned on a hand-drawn sketch along with knowledge cues, to regressively generate the scene, including the 3D object instances as well as their layout. Experiments on the 3D-FRONT dataset show that our model improves FID and CKL by 17.41% and 37.18% in 3D scene generation, and FID and KID by 19.12% and 20.06% in 3D scene completion, compared to the nearest competitor, DiffuScene.
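
As a rough mental model of the pipeline the abstract describes, the sketch below shows one training step of conditional denoising diffusion over per-object scene parameters, with the hand-drawn sketch and knowledge-base cues injected as a scene-level condition. This is a minimal illustration under assumed shapes and a simple additive conditioning scheme, not the SEK architecture itself; every module name, dimension, and embedding here is hypothetical.

```python
# A minimal, illustrative sketch (not the authors' SEK implementation) of the idea the
# abstract describes: denoising diffusion over per-object scene parameters, conditioned
# on a sketch embedding and knowledge-base cues. All module names, dimensions, and the
# additive conditioning scheme below are assumptions made purely for illustration.
import torch
import torch.nn as nn

T = 1000                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    """Forward process: incrementally diffuse clean scene parameters x0 toward N(0, I)."""
    a = alphas_cumprod[t].view(-1, 1, 1)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise

class ConditionalSceneDenoiser(nn.Module):
    """Toy stand-in for a 3D denoising scene transformer; a real model would also
    embed the timestep t and use a more elaborate conditioning mechanism."""
    def __init__(self, obj_dim=62, cond_dim=128):
        super().__init__()
        self.in_proj = nn.Linear(obj_dim + cond_dim, 256)
        layer = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out_proj = nn.Linear(256, obj_dim)

    def forward(self, x_t, sketch_emb, knowledge_emb):
        # Broadcast the scene-level condition (sketch + knowledge cues) to every object slot.
        cond = (sketch_emb + knowledge_emb).unsqueeze(1).expand(-1, x_t.size(1), -1)
        h = self.in_proj(torch.cat([x_t, cond], dim=-1))
        return self.out_proj(self.encoder(h))  # predicted noise, one vector per object slot

# One training step on random stand-in data: 8 scenes, 12 object slots, 62-dim attributes
# (e.g. class logits, size, position, orientation -- the split is hypothetical).
x0 = torch.randn(8, 12, 62)
sketch_emb = torch.randn(8, 128)           # hand-drawn sketch embedding (placeholder)
knowledge_emb = torch.randn(8, 128)        # knowledge-graph relationship cues (placeholder)
t = torch.randint(0, T, (8,))
noise = torch.randn_like(x0)
model = ConditionalSceneDenoiser()
loss = nn.functional.mse_loss(model(q_sample(x0, t, noise), sketch_emb, knowledge_emb), noise)
loss.backward()
```

At sampling time, the learned denoiser would be applied step by step from pure Gaussian noise, with the same sketch and knowledge conditions fixed, to recover a plausible set of objects and their layout.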

Abstract (translated)

Generating realistic 3D scenes is challenging due to the complexity of room layouts and object geometries. We propose a sketch-based knowledge-enhanced diffusion architecture (SEK) for generating customized, diverse, and plausible 3D scenes. SEK conditions the denoising process on a hand-drawn sketch of the target scene and cues from an object-relationship knowledge base. We first construct an external knowledge base containing object relationships, then leverage knowledge-enhanced graph reasoning to help our model understand hand-drawn sketches. A scene is represented as a combination of 3D objects and their relationships, and is incrementally diffused until it reaches a Gaussian distribution. We propose a 3D denoising scene transformer that learns to reverse the diffusion process, conditioned on a hand-drawn sketch and knowledge cues, to generate the scene, including the 3D object instances and their layout. Experiments on the 3D-FRONT dataset show that our model improves FID and CKL by 17.41% and 37.18% in 3D scene generation, and FID and KID by 19.12% and 20.06% in 3D scene completion, compared to the nearest competitor, DiffuScene.

URL

https://arxiv.org/abs/2403.14121

PDF

https://arxiv.org/pdf/2403.14121.pdf
