NASH: Neural Architecture Search for Hardware-Optimized Machine Learning Models

Abstract

As machine learning (ML) algorithms are deployed in an ever-increasing number of applications, they need to achieve better trade-offs between high accuracy, high throughput, and low latency. This paper introduces NASH, a novel approach that applies neural architecture search to machine learning hardware. Using NASH, hardware designs can achieve not only high throughput and low latency but also superior accuracy. We present four versions of the NASH strategy, all of which show higher accuracy than the original models. The strategy can be applied to various convolutional neural networks, selecting specific model operations among many to guide the training process toward higher accuracy. Experimental results show that applying NASH to ResNet18 or ResNet34 increases top-1 accuracy by up to 3.1% and top-5 accuracy by up to 2.2% compared to the non-NASH version when tested on the ImageNet dataset. We also integrated this approach into the FINN hardware model synthesis tool to automate both its application and the generation of the hardware model. Results show that using FINN can achieve a maximum throughput of 324.5 fps. In addition, NASH models can also yield a better trade-off between accuracy and hardware resource utilization. The accuracy-hardware (HW) Pareto curve shows that the models with the four NASH versions represent the best trade-offs, achieving the highest accuracy for a given HW utilization. The code for our implementation is open source and publicly available on GitHub.
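To make the accuracy-hardware Pareto curve concrete, the following minimal sketch shows one common way to extract a Pareto front from a set of candidate models. It is not taken from the NASH code base: the function name `pareto_front`, the `(hw_utilization, accuracy)` tuple layout, and the sample numbers are all hypothetical and chosen only for illustration. A model is kept if no other model reaches at least the same accuracy with equal or lower hardware utilization.

```python
from typing import List, Tuple

# Each candidate is (hw_utilization, accuracy). Both fields are illustrative
# assumptions: lower utilization is better, higher accuracy is better.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by increasing utilization."""
    # Sort by utilization ascending; for equal utilization, best accuracy first.
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))
    front: List[Point] = []
    best_acc = float("-inf")
    for util, acc in ordered:
        # A point is kept only if it beats every cheaper (or equally cheap) point.
        if acc > best_acc:
            front.append((util, acc))
            best_acc = acc
    return front


# Hypothetical placeholder values, not results from the paper.
candidates = [(10.0, 60.0), (20.0, 65.0), (15.0, 58.0), (30.0, 64.0), (25.0, 70.0)]
front = pareto_front(candidates)
print(front)
```

In this toy input, the points with utilization 15.0 and 30.0 are dropped because a cheaper candidate already achieves higher accuracy; the remaining points trace the curve of best accuracy for a given hardware budget.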

URL

https://arxiv.org/abs/2403.01845

PDF

https://arxiv.org/pdf/2403.01845.pdf
