Abstract
We present GenN2N, a unified NeRF-to-NeRF translation framework for various NeRF translation tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc. Unlike previous methods designed for individual translation tasks with task-specific schemes, GenN2N achieves all these NeRF editing tasks by employing a plug-and-play image-to-image translator to perform editing in the 2D domain and lifting the 2D edits into the 3D NeRF space. Since the 3D consistency of 2D edits cannot be assured, we propose to model the distribution of the underlying 3D edits through a generative model that can cover all possible edited NeRFs. To model the distribution of 3D edited NeRFs from 2D edited images, we carefully design a VAE-GAN that encodes images while decoding NeRFs. The latent space is trained to align with a Gaussian distribution, and the NeRFs are supervised through an adversarial loss on their renderings. To ensure the latent code does not depend on 2D viewpoints but truly reflects the 3D edits, we also regularize the latent code through a contrastive learning scheme. Extensive experiments on various editing tasks show that GenN2N, as a universal framework, performs on par with or better than task-specific specialists while possessing flexible generative power. More results are available on our project page: this https URL
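The two latent-space constraints described above — aligning the latent space with a Gaussian prior and regularizing latent codes across viewpoints with a contrastive objective — can be sketched as loss terms. This is a minimal illustrative sketch, not the authors' implementation: the function names, the InfoNCE form of the contrastive loss, and all hyperparameters are assumptions.

```python
import numpy as np

def kl_loss(mu, logvar):
    # KL( N(mu, exp(logvar)) || N(0, I) ), averaged over dimensions;
    # pulls the edit latent space toward a Gaussian prior.
    return -0.5 * np.mean(1.0 + logvar - mu**2 - np.exp(logvar))

def contrastive_loss(z_a, z_b, temperature=0.1):
    # z_a, z_b: (B, D) latent codes inferred from two different 2D
    # viewpoints of the same B edited scenes. InfoNCE-style objective
    # (an assumption about the exact form): codes of the same edit are
    # positives, other edits in the batch are negatives, so the latent
    # reflects the 3D edit rather than the 2D viewpoint.
    z_a = z_a / np.linalg.norm(z_a, axis=-1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=-1, keepdims=True)
    logits = z_a @ z_b.T / temperature            # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # match i-th to i-th
```

In a full training loop these terms would be weighted and summed with the rendering reconstruction loss and the adversarial loss on renderings mentioned in the abstract.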
URL
https://arxiv.org/abs/2404.02788