Abstract
Omnidirectional images (ODIs) are widely used in real-world visual tasks, and high-resolution ODIs improve the performance of these tasks. Most existing super-resolution methods for ODIs adopt end-to-end learning strategies, which yield generated images of inferior realness and lack effective out-of-domain generalization. Image generation methods, represented by diffusion models, provide strong priors for visual tasks and have proven effective for image restoration. Leveraging the image priors of the Stable Diffusion (SD) model, we achieve omnidirectional image super-resolution with both fidelity and realness, dubbed OmniSSR. First, we transform equirectangular projection (ERP) images into tangent projection (TP) images, whose distribution approximates that of the planar image domain. Then, we use SD to iteratively sample initial high-resolution results. At each denoising iteration, we further correct and update the initial results with the proposed Octadecaplex Tangent Information Interaction (OTII) and Gradient Decomposition (GD) techniques to ensure better consistency. Finally, the TP images are transformed back to obtain the final high-resolution results. Our method is zero-shot, requiring no training or fine-tuning. Experiments on two benchmark datasets demonstrate the effectiveness of the proposed method.
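The first step of the pipeline, mapping an ERP image to a tangent-projection patch, is the standard inverse gnomonic projection. A minimal sketch is below; the function name, nearest-neighbour sampling, and parameter choices are illustrative assumptions, not the paper's actual implementation (which also defines the inverse TP-to-ERP mapping and the OTII/GD corrections).

```python
import numpy as np

def erp_to_tangent(erp, lon0, lat0, fov, out_size):
    """Sample a tangent-projection (gnomonic) patch from an ERP image.

    erp: (H, W) or (H, W, C) equirectangular image.
    lon0, lat0: tangent point on the sphere, in radians.
    fov: field of view of the patch, in radians.
    """
    H, W = erp.shape[:2]
    # Tangent-plane coordinates spanning the field of view.
    half = np.tan(fov / 2)
    xs = np.linspace(-half, half, out_size)
    x, y = np.meshgrid(xs, -xs)  # image y axis points down, sphere y up

    # Inverse gnomonic projection: plane point -> sphere (lon, lat).
    rho = np.sqrt(x**2 + y**2)
    c = np.arctan(rho)
    sin_c, cos_c = np.sin(c), np.cos(c)
    with np.errstate(invalid="ignore", divide="ignore"):
        lat = np.arcsin(
            cos_c * np.sin(lat0)
            + np.where(rho > 0, y * sin_c * np.cos(lat0) / rho, 0.0)
        )
    lon = lon0 + np.arctan2(
        x * sin_c, rho * np.cos(lat0) * cos_c - y * np.sin(lat0) * sin_c
    )

    # Sphere (lon, lat) -> ERP pixel coordinates, nearest neighbour.
    u = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = np.clip(((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return erp[v, u]
```

With 18 such patches ("octadecaplex") covering the sphere, each patch is close enough to the planar image domain for a pretrained SD model to process; the corresponding forward (tangent-to-ERP) mapping stitches them back into the final ERP result.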
URL
https://arxiv.org/abs/2404.10312