TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search


Abstract

Neural architecture search (NAS) is an effective method for discovering new convolutional neural network (CNN) architectures. However, existing approaches often require time-consuming training or intensive sampling and evaluation. Zero-shot NAS aims to create training-free proxies for predicting architecture performance, but existing proxies perform suboptimally and are often outperformed by simple metrics such as parameter count or the number of floating-point operations (FLOPs). Moreover, existing model-based proxies cannot generalize to new search spaces containing previously unseen operator types without ground-truth accuracy labels. A universally optimal proxy remains elusive. We introduce TG-NAS, a novel model-based universal proxy that leverages a transformer-based operator embedding generator and a graph convolution network (GCN) to predict architecture performance. This approach guides neural architecture search across any given search space without retraining. Unlike other model-based predictor subroutines, TG-NAS itself acts as a zero-cost (ZC) proxy, guiding architecture search with advantages in data independence, cost-effectiveness, and consistency across diverse search spaces. Our experiments demonstrate its advantages over existing proxies across various NAS benchmarks, suggesting its potential as a foundational element for efficient architecture search. TG-NAS achieves up to 300X improvement in search efficiency compared to previous state-of-the-art (SOTA) ZC proxy methods. Notably, it discovers competitive models with 93.75% CIFAR-10 accuracy on the NAS-Bench-201 space and 74.5% ImageNet top-1 accuracy on the DARTS space.
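
The abstract describes scoring an architecture by embedding each operator and passing the resulting graph through a GCN. The sketch below illustrates that general idea only and is not the authors' implementation. Everything in it is an assumption made for illustration: the operator names, the `toy_operator_embedding` stand-in (a seeded pseudo-random vector in place of a transformer-generated embedding), the single GCN layer, the symmetrized adjacency, the mean-pool readout, and the random, untrained weights.

```python
import math
import random

EMB_DIM = 8   # hypothetical embedding width
HIDDEN = 4    # hypothetical GCN hidden width


def toy_operator_embedding(name, dim=EMB_DIM):
    """Stand-in for a learned operator embedding (assumption):
    a deterministic pseudo-random vector derived from the operator name."""
    rng = random.Random(name)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + A^T + I) D^{-1/2} for a square 0/1 adjacency matrix.
    Symmetrizing the directed cell graph is a simplifying assumption."""
    n = len(adj)
    a = [[1.0 if (i == j or adj[i][j] or adj[j][i]) else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def predict_score(ops, adj, w1, w_out):
    """Embed operators, apply one GCN layer, mean-pool nodes, linear readout."""
    h = [toy_operator_embedding(op) for op in ops]
    h = gcn_layer(normalize_adjacency(adj), h, w1)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    return sum(p * w for p, w in zip(pooled, w_out))


# Random, untrained weights: the score is only a placeholder value.
rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(EMB_DIM)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

# A hypothetical 4-node cell: input feeds a conv and a skip, both feed output.
cell_ops = ["input", "conv3x3", "skip", "output"]
cell_adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
score = predict_score(cell_ops, cell_adj, w1, w_out)
print(score)
```

Because operators enter the model only through their embedding vectors, a predictor of this shape can in principle be applied to a search space with different operators by embedding the new operator names, which is the property the abstract attributes to TG-NAS.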

URL

https://arxiv.org/abs/2404.00271

PDF

https://arxiv.org/pdf/2404.00271.pdf
