NASH: Neural Architecture Search for Hardware-Optimized Machine Learning Models

2024-03-04 08:51:38
Mengfei Ji, Zaid Al-Ars
     

Abstract

As machine learning (ML) algorithms are deployed in an ever-increasing number of applications, they need to achieve better trade-offs between high accuracy, high throughput and low latency. This paper introduces NASH, a novel approach that applies neural architecture search to machine learning hardware. Using NASH, hardware designs can achieve not only high throughput and low latency but also superior accuracy. We present four versions of the NASH strategy in this paper, all of which show higher accuracy than the original models. The strategy can be applied to various convolutional neural networks, selecting specific model operations among many to guide the training process toward higher accuracy. Experimental results show that applying NASH to ResNet18 or ResNet34 achieves a top-1 accuracy increase of up to 3.1% and a top-5 accuracy increase of up to 2.2% compared to the non-NASH version when tested on the ImageNet dataset. We also integrated this approach into the FINN hardware model synthesis tool to automate the application of our approach and the generation of the hardware model. Results show that using FINN can achieve a maximum throughput of 324.5 fps. In addition, NASH models can also result in a better trade-off between accuracy and hardware resource utilization. The accuracy-hardware (HW) Pareto curve shows that the models with the four NASH versions represent the best trade-offs, achieving the highest accuracy for a given HW utilization. The code for our implementation is open-source and publicly available on GitHub at this https URL.
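The accuracy-hardware Pareto curve mentioned in the abstract can be illustrated with a minimal sketch. It assumes each model is summarized by a hardware utilization figure (lower is better) and an accuracy figure (higher is better). The function name `pareto_front`, the model names, and all numbers are hypothetical placeholders for illustration; none of them come from the paper.

```python
from typing import List, Tuple

# (name, hw_utilization, accuracy); all values below are made up for illustration
Model = Tuple[str, float, float]


def pareto_front(models: List[Model]) -> List[Model]:
    """Return the models that no other model dominates.

    A model is dominated if another model uses no more hardware and is
    at least as accurate, and is strictly better in one of the two.
    """
    front = []
    for name, hw, acc in models:
        dominated = any(
            (o_hw <= hw and o_acc >= acc) and (o_hw < hw or o_acc > acc)
            for _, o_hw, o_acc in models
        )
        if not dominated:
            front.append((name, hw, acc))
    # Sort by hardware utilization so the front reads like a curve
    return sorted(front, key=lambda m: m[1])


# Hypothetical design points, not results from the paper
models = [
    ("model_a", 10.0, 60.0),
    ("model_b", 12.0, 58.0),  # more hardware and less accurate than model_a
    ("model_c", 20.0, 65.0),
    ("model_d", 20.0, 63.0),  # same hardware as model_c but less accurate
    ("model_e", 30.0, 70.0),
]

front = pareto_front(models)
print([name for name, _, _ in front])
```

Under this definition, a model lies on the curve when no alternative offers at least the same accuracy for no more hardware, which is the sense in which the abstract describes the NASH versions as the best trade-offs.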

URL

https://arxiv.org/abs/2403.01845

PDF

https://arxiv.org/pdf/2403.01845.pdf

