Abstract
Surgical scene simulation plays a crucial role in surgical education and simulator-based robot learning. Traditional approaches to creating these environments rely on a labor-intensive process in which designers hand-craft tissue models, with textures and geometries, for soft-body simulation. This manual approach is not only time-consuming but also limited in scalability and realism. In contrast, data-driven simulation offers a compelling alternative: it has the potential to automatically reconstruct 3D surgical scenes from real-world surgical video data and then apply soft-body physics. This area, however, remains relatively unexplored. In our research, we introduce 3D Gaussians as a learnable representation of the surgical scene, learned from stereo endoscopic video. To prevent over-fitting and ensure the geometric correctness of these scenes, we incorporate depth supervision and anisotropy regularization into the Gaussian learning process. Furthermore, we apply the Material Point Method, integrated with physical properties, to the 3D Gaussians to achieve realistic scene deformations. Our method was evaluated on an in-house dataset and public surgical video datasets. Results show that it can reconstruct and simulate surgical scenes from endoscopic videos efficiently, taking only a few minutes to reconstruct a surgical scene, and produce both visually and physically plausible deformations at near-real-time speed. These results demonstrate the potential of our method to enhance the efficiency and variety of simulations available for surgical education and robot learning.
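For illustration only (the paper's exact formulation is not given in this abstract): anisotropy regularization on 3D Gaussians is commonly implemented by penalizing Gaussians whose per-axis scales become too elongated, which keeps the reconstructed geometry well-conditioned for downstream physics. A minimal sketch, assuming per-Gaussian scale vectors and a hypothetical ratio threshold `max_ratio`:

```python
import numpy as np

def anisotropy_loss(scales: np.ndarray, max_ratio: float = 2.0) -> float:
    """Penalize overly elongated ("needle-like") Gaussians.

    scales:    (N, 3) array of positive per-Gaussian scale parameters.
    max_ratio: hypothetical threshold on the largest-to-smallest axis
               ratio; only Gaussians exceeding it incur a penalty.
    """
    ratio = scales.max(axis=1) / scales.min(axis=1)      # (N,) elongation
    return float(np.mean(np.maximum(ratio - max_ratio, 0.0)))

# Example: one near-isotropic and one elongated Gaussian.
scales = np.array([[1.0, 1.0, 1.0],
                   [5.0, 1.0, 1.0]])
loss = anisotropy_loss(scales)  # only the second Gaussian is penalized
```

In practice a term like this would be added, with some weight, to the photometric and depth-supervision losses during Gaussian optimization.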
URL
https://arxiv.org/abs/2405.00956