Paper Reading AI Learner

V-NAS: Neural Architecture Search for Volumetric Medical Image Segmentation

2019-06-06 21:07:40
Zhuotun Zhu, Chenxi Liu, Dong Yang, Alan Yuille, Daguang Xu

Abstract

Deep learning algorithms, in particular 2D and 3D fully convolutional neural networks (FCNs), have rapidly become the mainstream methodology for volumetric medical image segmentation. However, 2D convolutions cannot fully leverage the rich spatial information along the third axis, while 3D convolutions suffer from demanding computation and high GPU memory consumption. In this paper, we propose to automatically search a network architecture tailored to the volumetric medical image segmentation problem. Concretely, we formulate the structure learning as differentiable neural architecture search, and let the network itself choose between 2D, 3D, or Pseudo-3D (P3D) convolutions at each layer. We evaluate our method on three public datasets: the NIH Pancreas dataset, and the Lung and Pancreas datasets from the Medical Segmentation Decathlon (MSD) Challenge. Our method, named V-NAS, consistently outperforms other state-of-the-art methods on the segmentation of both a normal organ (NIH Pancreas) and abnormal organs (MSD Lung tumors and MSD Pancreas tumors), which shows the power of the chosen architecture. Moreover, the architecture searched on one dataset generalizes well to other datasets, which demonstrates the robustness and practical value of our proposed method.
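The core idea of letting each layer choose among 2D, 3D, and P3D convolutions can be sketched as a DARTS-style continuous relaxation: each candidate operation gets an architecture weight, and the layer output is the softmax-weighted sum of all candidates, so the choice is learned by gradient descent. The sketch below is a minimal PyTorch illustration under that assumption; the class name `MixedConvCell` and all hyperparameters are hypothetical, not the authors' exact V-NAS cell.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedConvCell(nn.Module):
    """Hypothetical sketch: a differentiable choice between 2D, 3D,
    and Pseudo-3D (P3D) convolutions, relaxed DARTS-style."""

    def __init__(self, channels):
        super().__init__()
        # "2D" convolution applied slice-wise, expressed as a (1,3,3) 3D kernel
        self.conv2d = nn.Conv3d(channels, channels, (1, 3, 3), padding=(0, 1, 1))
        # Full 3D convolution with a 3x3x3 kernel
        self.conv3d = nn.Conv3d(channels, channels, 3, padding=1)
        # Pseudo-3D: in-plane (1,3,3) conv followed by an axial (3,1,1) conv
        self.p3d = nn.Sequential(
            nn.Conv3d(channels, channels, (1, 3, 3), padding=(0, 1, 1)),
            nn.Conv3d(channels, channels, (3, 1, 1), padding=(1, 0, 0)),
        )
        # One architecture logit per candidate operation (learned by gradient descent)
        self.alpha = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        # Softmax relaxation: a differentiable "soft choice" among the three ops
        w = F.softmax(self.alpha, dim=0)
        return w[0] * self.conv2d(x) + w[1] * self.conv3d(x) + w[2] * self.p3d(x)

cell = MixedConvCell(channels=4)
x = torch.randn(1, 4, 8, 16, 16)  # (batch, channels, depth, height, width)
y = cell(x)
```

After search, the relaxation is typically discretized by keeping, at each layer, the operation with the largest architecture weight, yielding a mixed 2D/3D/P3D network.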


URL

https://arxiv.org/abs/1906.02817

PDF

https://arxiv.org/pdf/1906.02817.pdf

