Paper Reading AI Learner

RadRotator: 3D Rotation of Radiographs with Diffusion Models

2024-04-19 16:55:12
Pouria Rouzrokh, Bardia Khosravi, Shahriar Faghani, Kellen L. Mulford, Michael J. Taunton, Bradley J. Erickson, Cody C. Wyles

Abstract

Transforming two-dimensional (2D) images into three-dimensional (3D) volumes is a well-known yet challenging problem for the computer vision community. In the medical domain, a few previous studies have attempted to convert two or more input radiographs into computed tomography (CT) volumes. Building on these efforts, we introduce a diffusion model-based technique that can rotate the anatomical content of any input radiograph in 3D space, potentially enabling the visualization of the radiograph's entire anatomical content from any viewpoint in 3D. Like previous studies, we used CT volumes to create Digitally Reconstructed Radiographs (DRRs) as training data for our model. However, we addressed two significant limitations of previous studies: 1. We utilized conditional diffusion models with classifier-free guidance instead of Generative Adversarial Networks (GANs) to achieve higher mode coverage and improved output image quality, with the only trade-off being slower inference, which is often less critical in medical applications. 2. We demonstrated that the unreliable outputs of style-transfer deep learning (DL) models such as Cycle-GAN, previously used to transfer the style of actual radiographs to DRRs, can be replaced with a simple yet effective training transformation that randomly alters the pixel-intensity histograms of the input and ground-truth imaging data during training. This transformation makes the diffusion model agnostic to variations in the pixel-intensity distribution of the input data, allowing a DL model to be trained reliably on DRRs and then applied, unchanged, to conventional radiographs (or DRRs) at inference time.

URL

https://arxiv.org/abs/2404.13000

PDF

https://arxiv.org/pdf/2404.13000.pdf
