M3BUNet: Mobile Mean Max UNet for Pancreas Segmentation on CT-Scans

2024-01-18 23:10:08
Juwita juwita, Ghulam Mubashar Hassan, Naveed Akhtar, Amitava Datta

Abstract

Segmenting organs in CT scan images is a necessary step for multiple downstream medical image analysis tasks. Currently, manual CT scan segmentation by radiologists is prevalent, especially for organs like the pancreas, which requires a high level of domain expertise for reliable segmentation due to factors like small organ size, occlusion, and varying shapes. For automated pancreas segmentation, these factors translate into a shortage of reliable labeled data for training effective segmentation models. Consequently, the performance of contemporary pancreas segmentation models is still not within acceptable ranges. To address this, we propose M3BUNet, a fusion of MobileNet and U-Net neural networks, equipped with a novel Mean-Max (MM) attention that operates in two stages to gradually segment pancreas CT images from coarse to fine with mask guidance for object detection. This approach enables the network to surpass the segmentation performance of similar network architectures and to achieve results on par with complex state-of-the-art methods, all while maintaining a low parameter count. Additionally, we introduce external contour segmentation as a preprocessing step for the coarse stage to assist the segmentation process through image standardization. For the fine segmentation stage, we found that applying a wavelet decomposition filter to create multi-input images enhances pancreas segmentation performance. We extensively evaluate our approach on the widely known NIH pancreas dataset and MSD pancreas dataset. Our approach demonstrates a considerable performance improvement, achieving an average Dice Similarity Coefficient (DSC) of up to 89.53% and an Intersection over Union (IoU) score of up to 81.16% on the NIH pancreas dataset, and 88.60% DSC and 79.90% IoU on the MSD pancreas dataset.
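
The abstract reports results as DSC and IoU without defining them. As a reading aid, here is a minimal sketch of the standard definitions for a pair of binary masks. The function name `dice_and_iou` and the flat-list mask representation are assumptions made for illustration; this is not the authors' evaluation code, and how the paper averages scores across cases is not stated in the abstract.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two equal-length flat binary masks.

    Hypothetical helper for illustration only; masks are sequences of 0/1
    (a 2D or 3D mask would be flattened first).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # Count foreground pixels in each mask and in their overlap.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection

    # Convention assumed here: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_size + target_size)
    iou = intersection / union
    return dice, iou


# Toy example: a 2x3 prediction and ground truth, flattened row by row.
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred, target)
print(f"DSC = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two metrics are tied by DSC = 2·IoU / (1 + IoU), so DSC is never lower than IoU. Averages over a dataset need not satisfy this identity exactly, since each metric is typically averaged per case.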

URL

https://arxiv.org/abs/2401.10419

PDF

https://arxiv.org/pdf/2401.10419.pdf

