Abstract
We propose a method for editing NeRF scenes with text instructions. Given a NeRF of a scene and the collection of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that respects the edit instruction. We demonstrate that our proposed method can edit large-scale, real-world scenes and accomplish more realistic, targeted edits than prior work.
Abstract (translated)
We propose a method for editing NeRF scenes with text instructions. Given a NeRF of a scene and the collection of images used to reconstruct it, we use an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, ultimately producing an optimized 3D scene that conforms to the edit instruction. We demonstrate that our proposed method can edit large-scale, real-world scenes and achieve more realistic, targeted edits than prior work.
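To make the loop described in the abstract concrete, here is a minimal Python sketch of alternating per-image 2D edits with 3D scene optimization. The `nerf` object, the `edit_image` callable (standing in for an InstructPix2Pix call), the dataset fields, and the update schedule and conditioning scheme are hypothetical placeholders chosen for illustration, not the paper's actual interfaces or hyperparameters.

```python
import random

def edit_nerf_with_instruction(nerf, edit_image, dataset, instruction,
                               num_iters=30000, edit_every=10):
    """Alternate per-image 2D edits with 3D optimization (illustrative sketch only).

    nerf       -- hypothetical object exposing render(camera),
                  reconstruction_loss(camera, image), and optimizer_step(loss).
    edit_image -- callable standing in for an InstructPix2Pix edit:
                  edit_image(image, condition_image, instruction) -> edited image.
    dataset    -- list of dicts with 'camera', 'original', and 'current' entries.
    """
    for step in range(num_iters):
        view = random.choice(dataset)

        # Periodically replace a training image: render the view from the
        # current NeRF and let the diffusion model edit it, guided by the
        # text instruction (the conditioning on the original capture is an
        # assumption made for this sketch).
        if step % edit_every == 0:
            rendering = nerf.render(view["camera"])
            view["current"] = edit_image(rendering, view["original"], instruction)

        # Standard NeRF optimization step against the gradually edited images,
        # which consolidates per-image 2D edits into a consistent 3D scene.
        loss = nerf.reconstruction_loss(view["camera"], view["current"])
        nerf.optimizer_step(loss)

    return nerf
```

Under these assumptions, the key idea the sketch conveys is that the 2D diffusion edits and the 3D reconstruction are interleaved rather than applied once up front, so view-inconsistent per-image edits are repeatedly averaged into a single coherent edited scene.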
URL
https://arxiv.org/abs/2303.12789