Abstract
We present a hardware-efficient convolutional neural network architecture with a RepVGG-like design. FLOPs and parameter counts are the traditional metrics for evaluating network efficiency, but they are insensitive to hardware characteristics such as computing ability and memory bandwidth. How to design a neural network that makes efficient use of the hardware's computing ability and memory bandwidth is therefore a critical problem. This paper proposes a method for designing hardware-aware neural networks. Based on this method, we design the EfficientRep series of convolutional networks, which are friendly to high-computation hardware (e.g., GPUs) and are used in the YOLOv6 object detection framework. YOLOv6 has released the YOLOv6N, YOLOv6S, YOLOv6M, and YOLOv6L models in its v1 and v2 versions.
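To illustrate why FLOPs alone can mislead, the sketch below compares a dense 3x3 convolution with a depthwise 3x3 convolution using a roofline-style estimate. This is not the method proposed in the paper; it is a standard back-of-the-envelope model, and the layer shapes, the `ConvLayer` helper, and the hardware figures (`PEAK_FLOPS`, `BANDWIDTH`) are all hypothetical values chosen for illustration. It also assumes stride 1 with "same" padding and that inputs, weights, and outputs each cross memory exactly once.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical helper describing one convolution (stride 1, 'same' padding)."""
    c_in: int
    c_out: int
    kernel: int
    h_out: int
    w_out: int
    groups: int = 1

    def flops(self) -> int:
        # One multiply-accumulate counts as 2 FLOPs.
        macs = (self.h_out * self.w_out * self.c_out
                * (self.c_in // self.groups) * self.kernel ** 2)
        return 2 * macs

    def bytes_moved(self, bytes_per_elem: int = 2) -> int:
        # Simplifying assumption: each tensor is read or written once (FP16 by default).
        inputs = self.c_in * self.h_out * self.w_out
        weights = self.c_out * (self.c_in // self.groups) * self.kernel ** 2
        outputs = self.c_out * self.h_out * self.w_out
        return bytes_per_elem * (inputs + weights + outputs)

    def arithmetic_intensity(self) -> float:
        # FLOPs performed per byte of memory traffic.
        return self.flops() / self.bytes_moved()


def attainable_flops(layer: ConvLayer, peak_flops: float, bandwidth: float) -> float:
    """Roofline bound: limited by either peak compute or memory bandwidth."""
    return min(peak_flops, layer.arithmetic_intensity() * bandwidth)


# Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
BANDWIDTH = 1e12

dense = ConvLayer(c_in=128, c_out=128, kernel=3, h_out=40, w_out=40)
depthwise = ConvLayer(c_in=128, c_out=128, kernel=3, h_out=40, w_out=40, groups=128)

for name, layer in [("dense 3x3", dense), ("depthwise 3x3", depthwise)]:
    bound = attainable_flops(layer, PEAK_FLOPS, BANDWIDTH)
    print(f"{name}: {layer.flops() / 1e6:.1f} MFLOPs, "
          f"intensity {layer.arithmetic_intensity():.1f} FLOP/byte, "
          f"attainable {bound / 1e12:.1f} TFLOP/s")
```

Under these assumptions the depthwise layer needs far fewer FLOPs but is limited by memory bandwidth, so it can use only a small fraction of the peak compute, while the dense layer is compute-bound. This is the kind of gap between FLOP counts and actual hardware utilization that motivates a hardware-aware design.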
URL
https://arxiv.org/abs/2302.00386