Training Full Spike Neural Networks via Auxiliary Accumulation Pathway

Abstract

Brain-inspired Spiking Neural Networks (SNNs) are attracting growing attention because binary spike signals allow the conventional high-power multiply-accumulate (MAC) operation to be replaced with a low-power accumulate (AC) operation. However, binary spike propagation in Full-Spike Neural Networks (FSNNs) with a limited number of time steps is prone to significant information loss. To improve performance, several state-of-the-art SNN models trained from scratch inevitably introduce many non-spike operations. These operations incur additional computational cost and may not be deployable on neuromorphic hardware that supports only spike operations. To train a large-scale, high-performance FSNN, this paper proposes a novel Dual-Stream Training (DST) method that adds a detachable Auxiliary Accumulation Pathway (AAP) to full-spike residual networks. The accumulation in the AAP compensates for the information lost during the forward and backward passes of full-spike propagation and facilitates training of the FSNN. At test time, the AAP can be removed so that only the FSNN remains, which both preserves the low energy consumption and makes the model easy to deploy. Moreover, where non-spike operations are available, the AAP can instead be retained at inference to improve feature discrimination at the cost of a small amount of non-spike computation. Extensive experiments on the ImageNet, DVS Gesture, and CIFAR10-DVS datasets demonstrate the effectiveness of DST.
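Two ideas in the abstract can be illustrated with a toy NumPy sketch. First, when a layer's input is a 0/1 spike vector, the multiply-accumulate reduces to accumulating the weights of the inputs that fired. Second, a side branch that only reads from the spike stream can be dropped without changing the spike stream's output. This is not the paper's architecture: the layer sizes, the stateless `fire` threshold, the single time step, and the names `ac_layer`, `forward`, and `aux_weights` are all assumptions made for illustration.

```python
import numpy as np


def mac_layer(weights, x):
    """Dense layer as multiply-accumulate: each weight is multiplied by an input."""
    return weights @ x


def ac_layer(weights, spikes):
    """Same layer for a 0/1 input: add up the weight columns whose input fired."""
    return weights[:, spikes.astype(bool)].sum(axis=1)


def fire(potential, threshold=1.0):
    """Hypothetical stateless neuron: emit 1.0 where the potential reaches threshold."""
    return (potential >= threshold).astype(np.float64)


def forward(spikes_in, spike_weights, aux_weights=None):
    """Toy single-time-step pass; returns (output spikes, auxiliary sum or None)."""
    spikes, aux = spikes_in, None
    for i, w in enumerate(spike_weights):
        spikes = fire(ac_layer(w, spikes))       # spike stream: accumulation only
        if aux_weights is not None:              # optional, detachable side branch
            read = ac_layer(aux_weights[i], spikes)
            aux = read if aux is None else aux + read
    return spikes, aux


rng = np.random.default_rng(0)
x = (rng.random(16) < 0.5).astype(np.float64)    # binary input spikes
spike_w = [rng.normal(size=(8, 16)), rng.normal(size=(8, 8))]
aux_w = [rng.normal(size=(4, 8)), rng.normal(size=(4, 8))]   # assumed shapes

with_aux, aux_sum = forward(x, spike_w, aux_w)
without_aux, _ = forward(x, spike_w)

# MAC and AC agree on binary input.
print(np.allclose(mac_layer(spike_w[0], x), ac_layer(spike_w[0], x)))
# Detaching the side branch leaves the spike stream's output unchanged.
print(np.array_equal(with_aux, without_aux))
```

In this sketch the side branch is detachable only because nothing flows from it back into the spike stream. How the actual AAP is wired into the residual blocks, and how it contributes to the training loss, is specified in the paper rather than here.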

URL

https://arxiv.org/abs/2301.11929

PDF

https://arxiv.org/pdf/2301.11929.pdf