
SWAP-NAS: Sample-Wise Activation Patterns For Ultra-Fast NAS

2024-03-07 02:40:42
Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

Abstract

Training-free metrics (a.k.a. zero-cost proxies) are widely used to avoid resource-intensive neural network training, especially in Neural Architecture Search (NAS). Recent studies show that existing training-free metrics have several limitations, such as limited correlation and poor generalisation across different search spaces and tasks. Hence, we propose Sample-Wise Activation Patterns and its derivative, SWAP-Score, a novel high-performance training-free metric. It measures the expressivity of networks over a batch of input samples. The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101. The SWAP-Score can be further enhanced by regularisation, which leads to even higher correlations in cell-based search spaces and enables model size control during the search. For example, Spearman's rank correlation coefficient between regularised SWAP-Score and CIFAR-100 validation accuracies on NAS-Bench-201 networks is 0.90, significantly higher than the 0.80 achieved by the second-best metric, NWOT. When integrated with an evolutionary algorithm for NAS, our SWAP-NAS achieves competitive performance on CIFAR-10 and ImageNet in approximately 6 and 9 minutes of GPU time, respectively.
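
The abstract evaluates a training-free metric by Spearman's rank correlation between its scores and ground-truth validation accuracies. As a rough illustration of that evaluation step only (not of SWAP-Score itself), the sketch below computes Spearman's rho in plain Python. The function names `average_ranks` and `spearman_rho` and the five score/accuracy pairs are hypothetical, invented for this example; they are not taken from the paper or from any benchmark.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with order[i].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical proxy scores and validation accuracies for five architectures
# (made-up numbers, used only to exercise the function).
proxy_scores = [12.0, 30.5, 18.2, 44.1, 25.0]
val_accuracy = [61.3, 70.2, 68.0, 73.5, 66.9]
rho = spearman_rho(proxy_scores, val_accuracy)
print(f"Spearman's rho = {rho:.2f}")
```

For these made-up values, only two architectures swap places between the two rankings, so rho comes out at about 0.9. A metric that ranked the architectures in exactly the same order as their accuracies would give 1.0.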


URL

https://arxiv.org/abs/2403.04161

PDF

https://arxiv.org/pdf/2403.04161.pdf

