Abstract
Generative modeling has progressed substantially in recent years, particularly in text-to-image and text-to-video synthesis. However, the medical field has not yet fully exploited the potential of large-scale foundation models for synthetic data generation. To address this gap in 3D medical imaging research, we introduce GenerateCT, the first method for text-conditional computed tomography (CT) generation, and we release the entire framework as open source. GenerateCT consists of a pre-trained large language model, a transformer-based text-conditional 3D chest CT generation architecture, and a text-conditional spatial super-resolution diffusion model. We also propose CT-ViT, which efficiently compresses CT volumes while preserving autoregressiveness along the depth axis, enabling the generation of 3D CT volumes with a variable number of axial slices. Our experiments demonstrate that GenerateCT can produce realistic, high-resolution, high-fidelity 3D chest CT volumes that are consistent with medical-language text prompts. We further investigate the potential of GenerateCT by training a model on generated CT volumes for multi-abnormality classification of chest CT volumes. Our contributions provide a foundation for future research in text-conditional 3D medical image generation and have the potential to accelerate advances in medical imaging research. Our code, pre-trained models, and generated data are publicly available.
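To make the data flow described in the abstract concrete, the following is a toy sketch and not the authors' implementation. It chains three stand-in stages (text encoding, slice-by-slice low-resolution generation, per-slice spatial upsampling) and illustrates why generating autoregressively along depth allows a variable number of axial slices. All function names, array sizes, the example prompt, and the arithmetic inside each stage are hypothetical placeholders; the real components are learned models.

```python
import hashlib

import numpy as np


def encode_prompt(prompt: str, dim: int = 8) -> np.ndarray:
    """Placeholder for the pre-trained language model: text -> fixed vector."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:4], "little")
    return np.random.default_rng(seed).standard_normal(dim)


def generate_low_res_volume(text_emb: np.ndarray, num_slices: int,
                            size: int = 16) -> np.ndarray:
    """Placeholder for the text-conditional 3D generator.

    Slices are produced one at a time along depth. Slice k depends only on
    the text embedding and on slices before k, so the loop can stop at any
    depth (assumed toy dynamics, not the paper's model).
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    bias = float(text_emb.mean())
    rng = np.random.default_rng(0)  # fixed seed keeps the toy deterministic
    prev = np.zeros((size, size))
    slices = []
    for _ in range(num_slices):
        noise = rng.standard_normal((size, size))
        prev = 0.9 * prev + 0.1 * (noise + bias)
        slices.append(prev)
    return np.stack(slices)


def super_resolve_slice(low_res: np.ndarray, text_emb: np.ndarray,
                        factor: int = 4) -> np.ndarray:
    """Placeholder for the super-resolution diffusion model.

    Uses nearest-neighbour upsampling; text_emb is accepted only to mirror
    the text-conditional interface and is unused in this toy version.
    """
    return np.kron(low_res, np.ones((factor, factor)))


def generate_ct(prompt: str, num_slices: int) -> np.ndarray:
    """Run the three placeholder stages and return a (depth, H, W) volume."""
    text_emb = encode_prompt(prompt)
    low_res = generate_low_res_volume(text_emb, num_slices)
    return np.stack([super_resolve_slice(s, text_emb) for s in low_res])


if __name__ == "__main__":
    prompt = "Cardiomegaly with bilateral pleural effusion."  # made-up prompt
    short_volume = generate_ct(prompt, num_slices=4)
    long_volume = generate_ct(prompt, num_slices=6)
    print(short_volume.shape, long_volume.shape)
```

In this sketch, a longer volume extends a shorter one slice by slice: the first four slices of the six-slice volume equal the four-slice volume, which is the property that depth-wise autoregression provides.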
URL
https://arxiv.org/abs/2305.16037