Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS

2019-05-07 03:41:12
Yang Jiang, Cong Zhao, Lei Pang

Abstract

Neural architecture search (NAS) was proposed to automate the architecture design process and has attracted considerable interest from both academia and industry. However, it is prone to overfitting because of its high-dimensional search space, which is composed of the $operator$ selection and $skip$ connection of each layer. This paper analyzes the overfitting issue from a novel perspective, separating the primitives of the search space into architecture-overfitting-related and parameter-overfitting-related elements. The $operator$ of each layer, which mainly contributes to parameter overfitting and is important for model acceleration, is selected as the optimization target on top of a state-of-the-art architecture, while $skip$, which is related to architecture overfitting, is left out of the search. With the search space greatly reduced, the proposed method converges quickly and is practical to use in various tasks. Extensive experiments demonstrate that the proposed method achieves impressive results on tasks including classification and face recognition.
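
To give a rough sense of how much the search space shrinks when $skip$ is excluded from the search, the following is a minimal counting sketch and not the authors' implementation. It assumes, purely for illustration, that each layer independently picks one of `num_operators` operators and, when skip connections are searched, also makes a binary skip on/off choice. The function name, the layer count of 20, and the operator count of 5 are all hypothetical; the abstract does not specify them.

```python
def search_space_size(num_layers, num_operators, search_skip):
    """Count candidate architectures under a simplified per-layer model.

    Assumption (not from the paper): every layer chooses one operator and,
    if search_skip is True, additionally chooses skip on/off independently.
    """
    choices_per_layer = num_operators * (2 if search_skip else 1)
    return choices_per_layer ** num_layers


# Hypothetical setting: 20 layers, 5 candidate operators per layer.
full = search_space_size(num_layers=20, num_operators=5, search_skip=True)
reduced = search_space_size(num_layers=20, num_operators=5, search_skip=False)

# Under this model, fixing the skip connections shrinks the space
# by a factor of 2 ** num_layers.
print(f"full: {full}, operator-only: {reduced}, ratio: {full // reduced}")
```

Under these assumptions the reduction factor grows exponentially with depth, which is consistent with the abstract's claim that searching only over $operator$ leaves a much smaller space to explore.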

URL

https://arxiv.org/abs/1905.02341

PDF

https://arxiv.org/pdf/1905.02341.pdf

