Paper Reading AI Learner

UNet-2022: Exploring Dynamics in Non-isomorphic Architecture

2022-10-27 16:00:04
Jiansen Guo, Hong-Yu Zhou, Liansheng Wang, Yizhou Yu

Abstract

Recent medical image segmentation models are mostly hybrid, which integrate self-attention and convolution layers into the non-isomorphic architecture. However, one potential drawback of these approaches is that they failed to provide an intuitive explanation of why this hybrid combination manner is beneficial, making it difficult for subsequent work to make improvements on top of them. To address this issue, we first analyze the differences between the weight allocation mechanisms of the self-attention and convolution. Based on this analysis, we propose to construct a parallel non-isomorphic block that takes the advantages of self-attention and convolution with simple parallelization. We name the resulting U-shape segmentation model as UNet-2022. In experiments, UNet-2022 obviously outperforms its counterparts in a range segmentation tasks, including abdominal multi-organ segmentation, automatic cardiac diagnosis, neural structures segmentation, and skin lesion segmentation, sometimes surpassing the best performing baseline by 4%. Specifically, UNet-2022 surpasses nnUNet, the most recognized segmentation model at present, by large margins. These phenomena indicate the potential of UNet-2022 to become the model of choice for medical image segmentation.

Abstract (translated)

URL

https://arxiv.org/abs/2210.15566

PDF

https://arxiv.org/pdf/2210.15566.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot