SemCity: Semantic Scene Generation with Triplane Diffusion

Abstract

We present "SemCity," a 3D diffusion model for semantic scene generation in real-world outdoor environments. Most 3D diffusion models focus on generating a single object, synthetic indoor scenes, or synthetic outdoor scenes, while the generation of real-world outdoor scenes is rarely addressed. In this paper, we concentrate on generating a real-outdoor scene through learning a diffusion model on a real-world outdoor dataset. In contrast to synthetic data, real-outdoor datasets often contain more empty spaces due to sensor limitations, causing challenges in learning real-outdoor distributions. To address this issue, we exploit a triplane representation as a proxy form of scene distributions to be learned by our diffusion model. Furthermore, we propose a triplane manipulation that integrates seamlessly with our triplane diffusion model. The manipulation improves our diffusion model's applicability in a variety of downstream tasks related to outdoor scene generation such as scene inpainting, scene outpainting, and semantic scene completion refinements. In experimental results, we demonstrate that our triplane diffusion model shows meaningful generation results compared with existing work in a real-outdoor dataset, SemanticKITTI. We also show our triplane manipulation facilitates seamlessly adding, removing, or modifying objects within a scene. Further, it also enables the expansion of scenes toward a city-level scale. Finally, we evaluate our method on semantic scene completion refinements where our diffusion model enhances predictions of semantic scene completion networks by learning scene distribution. Our code is available at this https URL.
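
To make the triplane idea concrete, the following is a minimal sketch of a generic triplane feature lookup. It is not the SemCity implementation. The function names (`make_triplane`, `bilinear`, `query_triplane`), the random plane contents, the bilinear sampling, and the summation of the three per-plane features are all illustrative assumptions. The abstract does not specify how features are sampled or aggregated. Plain Python lists are used to keep the example dependency-free.

```python
import random


def make_plane(height, width, channels, rng):
    """Build one 2D feature plane as nested lists indexed [row][col][channel]."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(channels)]
             for _ in range(width)]
            for _ in range(height)]


def make_triplane(size_x, size_y, size_z, channels, seed=0):
    """Hypothetical triplane: three axis-aligned feature planes (xy, xz, yz).

    In a real model the planes would be learned or generated; here they are
    filled with random values purely for illustration.
    """
    rng = random.Random(seed)
    return {
        "xy": make_plane(size_x, size_y, channels, rng),
        "xz": make_plane(size_x, size_z, channels, rng),
        "yz": make_plane(size_y, size_z, channels, rng),
    }


def bilinear(plane, u, v):
    """Bilinearly interpolate a plane at the continuous index (u, v)."""
    h, w = len(plane), len(plane[0])
    # Clamp the query so it stays inside the plane.
    u = min(max(float(u), 0.0), h - 1.0)
    v = min(max(float(v), 0.0), w - 1.0)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, h - 1), min(v0 + 1, w - 1)
    du, dv = u - u0, v - v0
    corners = zip(plane[u0][v0], plane[u0][v1], plane[u1][v0], plane[u1][v1])
    return [
        (1 - du) * ((1 - dv) * a + dv * b) + du * ((1 - dv) * c + dv * d)
        for a, b, c, d in corners
    ]


def query_triplane(triplane, x, y, z):
    """Project (x, y, z) onto each plane, sample it, and sum the features.

    Summation is one common aggregation choice and is an assumption here.
    """
    f_xy = bilinear(triplane["xy"], x, y)
    f_xz = bilinear(triplane["xz"], x, z)
    f_yz = bilinear(triplane["yz"], y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]


if __name__ == "__main__":
    tp = make_triplane(size_x=8, size_y=8, size_z=4, channels=4)
    feature = query_triplane(tp, 2.5, 3.25, 1.0)
    print(len(feature))  # one value per channel
```

In a sketch like this, the three planes store on the order of C·(XY + XZ + YZ) values instead of the C·X·Y·Z values a dense feature volume would need. A separate decoder, omitted here, would map each sampled feature to a semantic label.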

URL

https://arxiv.org/abs/2403.07773

PDF

https://arxiv.org/pdf/2403.07773.pdf
