Paper Reading AI Learner

CaraNet: Context Axial Reverse Attention Network for Segmentation of Small Medical Objects

2023-01-31 02:12:33
Ange Lou, Shuyue Guan, Murray Loew

Abstract

Segmenting medical images accurately and reliably is important for disease diagnosis and treatment. It is a challenging task because of the wide variety of object sizes, shapes, and scanning modalities. Recently, many convolutional neural networks (CNNs) have been designed for segmentation tasks and have achieved great success. Few studies, however, have fully considered the sizes of objects, and thus most demonstrate poor performance on small-object segmentation. This can have a significant impact on the early detection of diseases. This paper proposes a Context Axial Reverse Attention Network (CaraNet) to improve segmentation performance on small objects compared with several recent state-of-the-art models. CaraNet applies axial reverse attention (ARA) and channel-wise feature pyramid (CFP) modules to mine feature information of small medical objects. We evaluate our model with six different metrics, testing CaraNet on brain tumor (BraTS 2018) and polyp (Kvasir-SEG, CVC-ColonDB, CVC-ClinicDB, CVC-300, and ETIS-LaribPolypDB) segmentation datasets. CaraNet achieves the top-ranked mean Dice segmentation accuracy, and the results show a distinct advantage of CaraNet in the segmentation of small medical objects.
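The abstract names two ingredients: axial attention (attending along one spatial axis at a time rather than over all pixel pairs) and reverse attention (weighting features by the complement of a coarse prediction so the network focuses on missed regions and boundaries). The sketch below illustrates both ideas in NumPy as they are generally described in the attention literature; it is a minimal illustration under assumed shapes and function names, not the paper's actual implementation, which adds learned projections, multi-head splits, and the CFP module.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x):
    """Self-attention factorized along the two spatial axes.

    x: (H, W, C) feature map. Full 2D self-attention compares all H*W
    position pairs, costing O((HW)^2); attending along the height axis,
    then the width axis, reduces this to O(HW * (H + W)).
    """
    h, w, c = x.shape
    # Height axis: within each column w, position h attends over positions g.
    scores_h = np.einsum('hwc,gwc->whg', x, x) / np.sqrt(c)      # (W, H, H)
    out = np.einsum('whg,gwc->hwc', softmax(scores_h), x)
    # Width axis: within each row h, position w attends over positions v.
    scores_w = np.einsum('hwc,hvc->hwv', out, out) / np.sqrt(c)  # (H, W, W)
    return np.einsum('hwv,hvc->hwc', softmax(scores_w), out)

def reverse_attention(features, coarse_pred):
    """Weight features by 1 - sigmoid(coarse prediction logits).

    Regions the coarse map already marks as foreground get weight near 0,
    steering refinement toward missed regions (often small objects and
    boundaries). features: (H, W, C); coarse_pred: (H, W) logits.
    """
    weight = 1.0 - 1.0 / (1.0 + np.exp(-coarse_pred))  # 1 - sigmoid
    return features * weight[..., None]
```

Factorizing attention this way is what makes pixel-level attention affordable at the high-resolution feature maps where small objects are still visible; the reverse-attention weighting is what biases those maps toward regions a coarse decoder has not yet captured.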

Abstract (translated)

Segmenting medical images accurately and reliably is important for disease diagnosis and treatment. It is a challenging task because of the wide variety of object sizes, shapes, and scanning modalities. In recent years, many convolutional neural networks (CNNs) have been designed for segmentation tasks and have achieved great success. However, few studies have fully considered object sizes, so most perform poorly on small-object segmentation, which can have a significant impact on the early detection of diseases. This paper proposes a Context Axial Reverse Attention Network (CaraNet) to improve small-object segmentation performance compared with several recent state-of-the-art models. CaraNet applies axial reverse attention (ARA) and channel-wise feature pyramid (CFP) modules to mine feature information of small medical objects. We evaluate our model with six different metrics, testing CaraNet on brain tumor (BraTS 2018) and polyp (Kvasir-SEG, CVC-ColonDB, CVC-ClinicDB, CVC-300, and ETIS-LaribPolypDB) segmentation datasets. CaraNet achieves the top mean Dice segmentation accuracy, and the results show a distinct advantage of CaraNet in segmenting small medical objects.

URL

https://arxiv.org/abs/2301.13366

PDF

https://arxiv.org/pdf/2301.13366.pdf
