Abstract
A cinemagraph is a unique form of visual media that combines still photography with subtle motion to create a captivating experience. However, most videos generated by recent works lack depth information and are confined to 2D image space. In this paper, inspired by the significant progress in novel view synthesis (NVS) achieved by 3D Gaussian Splatting (3D-GS), we propose LoopGaussian to elevate cinemagraphs from 2D image space to 3D space using 3D Gaussian modeling. To achieve this, we first employ the 3D-GS method to reconstruct 3D Gaussian point clouds from multi-view images of static scenes, incorporating shape regularization terms to prevent blurring or artifacts caused by object deformation. We then adopt an autoencoder tailored to 3D Gaussians to project them into a feature space. To maintain the local continuity of the scene, we devise a SuperGaussian clustering scheme based on the acquired features. By computing the similarity between clusters and employing a two-stage estimation method, we derive an Eulerian motion field that describes velocities across the entire scene. The 3D Gaussian points then move within the estimated Eulerian motion field. Through bidirectional animation techniques, we ultimately generate a 3D cinemagraph that exhibits natural and seamlessly loopable dynamics. Experimental results validate the effectiveness of our approach, demonstrating high-quality and visually appealing scene generation.
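The two animation ingredients named above, advecting points through an Eulerian velocity field and bidirectional animation for a seamless loop, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the analytic swirl field stands in for the motion field that LoopGaussian estimates from cluster similarities, and the linear forward/backward blend is one common way to close a loop.

```python
import numpy as np

def sample_velocity(points):
    # Hypothetical Eulerian field: a gentle swirl about the z-axis.
    # In LoopGaussian this velocity would come from the two-stage
    # estimation over SuperGaussian clusters, not an analytic formula.
    x, y = points[:, 0], points[:, 1]
    v = np.stack([-y, x, np.zeros_like(x)], axis=1)
    return 0.1 * v

def animate_loop(points, num_frames, dt=0.05):
    """Advect point positions forward and backward through the field,
    then blend the two passes so frame 0 and frame num_frames coincide."""
    fwd, bwd = points.copy(), points.copy()
    fwd_traj, bwd_traj = [fwd.copy()], [bwd.copy()]
    for _ in range(num_frames):
        fwd = fwd + dt * sample_velocity(fwd)   # forward Euler step
        bwd = bwd - dt * sample_velocity(bwd)   # backward Euler step
        fwd_traj.append(fwd.copy())
        bwd_traj.append(bwd.copy())
    frames = []
    for t in range(num_frames + 1):
        w = t / num_frames                      # 0 -> 1 across the loop
        # Cross-fade the forward pass into the reversed backward pass,
        # so the last frame returns exactly to the starting positions.
        frames.append((1 - w) * fwd_traj[t] + w * bwd_traj[num_frames - t])
    return frames
```

Because the blend weight reaches 1 at the final frame, the last frame equals the initial point positions, which is what makes the animation loop without a visible seam.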
URL
https://arxiv.org/abs/2404.08966