
GenerateCT: Text-Guided 3D Chest CT Generation

2023-05-25 13:16:39
Ibrahim Ethem Hamamci, Sezgin Er, Enis Simsar, Alperen Tezcan, Ayse Gulnihan Simsek, Furkan Almas, Sevval Nil Esirgun, Hadrien Reynaud, Sarthak Pati, Christian Bluethgen, Bjoern Menze

Abstract

Generative modeling has experienced substantial progress in recent years, particularly in text-to-image and text-to-video synthesis. However, the medical field has not yet fully exploited the potential of large-scale foundational models for synthetic data generation. In this paper, we introduce GenerateCT, the first method for text-conditional computed tomography (CT) generation, addressing the limitations in 3D medical imaging research and making our entire framework open-source. GenerateCT consists of a pre-trained large language model, a transformer-based text-conditional 3D chest CT generation architecture, and a text-conditional spatial super-resolution diffusion model. We also propose CT-ViT, which efficiently compresses CT volumes while preserving autoregressiveness in depth, enabling the generation of 3D CT volumes with variable numbers of axial slices. Our experiments demonstrate that GenerateCT can produce realistic, high-resolution, and high-fidelity 3D chest CT volumes consistent with medical language text prompts. We further investigate the potential of GenerateCT by training a model using generated CT volumes for multi-abnormality classification of chest CT volumes. Our contributions provide a valuable foundation for future research in text-conditional 3D medical image generation and have the potential to accelerate advancements in medical imaging research. Our code, pre-trained models, and generated data are available at this https URL.
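The abstract's point about compressing a volume while staying autoregressive in depth can be illustrated with a small sketch. This is not the authors' CT-ViT implementation: the function names, the patch sizes, and the group-wise causal mask are all hypothetical, chosen only to show why a depth-ordered token layout allows a variable number of axial slices.

```python
def token_grid(depth, height, width, patch_d, patch_hw):
    """Return (depth_groups, rows, cols) of patches for a volume.

    Hypothetical helper: splits a volume into non-overlapping patches of
    patch_d slices by patch_hw x patch_hw pixels.
    """
    if depth % patch_d or height % patch_hw or width % patch_hw:
        raise ValueError("volume dimensions must be divisible by the patch sizes")
    return depth // patch_d, height // patch_hw, width // patch_hw


def depth_causal_mask(depth_groups, tokens_per_group):
    """mask[i][j] is True when token i may attend to token j.

    Tokens are ordered depth group by depth group. A token sees every token
    in its own group and in earlier groups, never in later ones, so adding
    more slices only appends groups and leaves earlier tokens unaffected.
    """
    n = depth_groups * tokens_per_group
    return [
        [(j // tokens_per_group) <= (i // tokens_per_group) for j in range(n)]
        for i in range(n)
    ]


# Illustrative sizes only (assumptions, not values from the paper).
for slices in (4, 8):
    d, rows, cols = token_grid(slices, 8, 8, patch_d=2, patch_hw=4)
    print(slices, "slices:", d, "depth groups,", rows * cols, "tokens per group")

mask = depth_causal_mask(depth_groups=2, tokens_per_group=2)
```

In this toy layout the number of tokens per depth group is fixed by the in-plane resolution, so changing the slice count changes only the number of groups.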


URL

https://arxiv.org/abs/2305.16037

PDF

https://arxiv.org/pdf/2305.16037.pdf

