Abstract
Cancer is an abnormal growth of cells with the potential to invade locally and metastasize to distant organs. Accurate auto-segmentation of the tumor and surrounding normal tissues is required to optimize radiotherapy treatment plans. Recent AI-based segmentation models are generally trained on large public datasets, which lack the heterogeneity of local patient populations. While these studies advance AI-based medical image segmentation, research on local datasets is necessary to develop AI tumor segmentation models and integrate them directly into hospital software for efficient and accurate oncology treatment planning and execution. This study enhances tumor segmentation using computationally efficient hybrid UNet-Transformer models on magnetic resonance imaging (MRI) datasets acquired from a local hospital under strict privacy protection. We developed a robust data pipeline for DICOM extraction and preprocessing, followed by extensive image augmentation to improve model generalization across diverse clinical settings, resulting in a total of 6080 training images. Our novel architecture integrates a UNet-based convolutional neural network with a transformer bottleneck and complementary attention modules, including efficient attention, Squeeze-and-Excitation (SE) blocks, the Convolutional Block Attention Module (CBAM), and ResNeXt blocks. To accelerate convergence and reduce computational demands, we used a maximum batch size of 8 and initialized the encoder with pretrained ImageNet weights. The model was trained on dual NVIDIA T4 GPUs, with checkpointing used to work around Kaggle's runtime limits. Quantitative evaluation on the local MRI dataset yielded a Dice similarity coefficient of 0.764 and an Intersection over Union (IoU) of 0.736, demonstrating competitive performance despite limited data and underscoring the importance of site-specific model development for clinical deployment.
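For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a Dice similarity coefficient and an IoU can be computed for a single pair of binary masks. It is an illustration only, not the authors' evaluation code: the function name `dice_and_iou`, the smoothing term `eps`, and the toy masks are assumptions, and the abstract does not state how scores were aggregated across images.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of identical shape.

    Hypothetical helper for illustration; `eps` is an assumed smoothing
    term that keeps both ratios defined when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    # Pixels marked as tumor in both the prediction and the ground truth.
    intersection = np.logical_and(pred, target).sum()
    pred_sum = pred.sum()
    target_sum = target.sum()
    # Pixels marked as tumor in either mask.
    union = pred_sum + target_sum - intersection

    dice = (2.0 * intersection + eps) / (pred_sum + target_sum + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


if __name__ == "__main__":
    # Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in ground truth.
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    dice, iou = dice_and_iou(predicted, reference)
    print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single mask pair the two quantities are tied by IoU = Dice / (2 - Dice), so a gap between dataset-level Dice and IoU values depends on how per-image scores are averaged.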
URL
https://arxiv.org/abs/2506.15562