DPUV3INT8: A Compiler View to programmable FPGA Inference Engines

2021-10-08 18:33:12

Paolo D'Alberto, Jiangsha Ma, Jintao Li, Yiming Hu, Manasa Bollavaram, Shaoxia Fang

arXiv_CL Inference Optimization

Abstract

We have an FPGA design that we have made fast and efficient and have tested on a few important examples. Now we must infer a general solution to deploy in the data center. Here, we describe the FPGA DPUV3INT8 design and our compiler effort. The hand-tuned SW-HW solution for Resnet50_v1 achieves close to 2 times the throughput (images per second) of our best FPGA implementation. The compiler generalizes the hand-written techniques, achieving about 1.5 times better performance for the same example; it also extends the optimizations to a model zoo of networks and achieves 80+% HW efficiency.

URL

https://arxiv.org/abs/2110.04327

PDF

https://arxiv.org/pdf/2110.04327.pdf