Abstract
Diffusion models have emerged as effective tools for generating diverse, high-quality content. However, their capability for high-resolution image generation, particularly of panoramic images, is still limited by problems such as visible seams and incoherent transitions. In this paper, we propose TwinDiffusion, an optimized framework designed to address these challenges through two key innovations: Crop Fusion for quality enhancement and Cross Sampling for efficiency optimization. We introduce a training-free optimization stage to refine the similarity of adjacent image areas, as well as an interleaving sampling strategy to yield dynamic patches during the cropping process. We conduct a comprehensive evaluation comparing TwinDiffusion with existing methods in terms of coherence, fidelity, compatibility, and efficiency. The results demonstrate the superior performance of our approach in generating seamless and coherent panoramas, setting a new standard in quality and efficiency for panoramic image generation.
URL
https://arxiv.org/abs/2404.19475