Abstract
XyloAudio is a line of ultra-low-power audio inference chips, designed for in- and near-microphone analysis of audio in real-time, energy-constrained scenarios. Xylo is built around a highly efficient integer-logic processor which simulates parameter- and activity-sparse spiking neural networks (SNNs) using a leaky integrate-and-fire (LIF) neuron model. Neurons on Xylo are quantised integer devices operating in synchronous digital CMOS, with neuron and synapse state quantised to 16 bits and weight parameters quantised to 8 bits. Xylo is tailored for real-time streaming operation, as opposed to the accelerated-time operation of an inference accelerator. XyloAudio includes a low-power audio encoding interface for direct connection to a microphone, designed to sparsely encode incident audio for further processing by the inference core. In this report we present results for the DCASE 2020 acoustic scene classification audio benchmark deployed to XyloAudio 2. We describe the benchmark dataset, the audio preprocessing approach, and the network architecture and training approach. We present the performance of the trained model, and the results of power and latency measurements performed on the XyloAudio 2 development kit. This benchmark is conducted as part of the Neurobench project.
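To make the quantised neuron description concrete, the following is a minimal sketch of one time step of an integer LIF neuron with 16-bit state and 8-bit weights. It is an illustration only, not the Xylo implementation: the function names (`saturate`, `lif_step`), the bit-shift leak, the saturating arithmetic, the reset-by-subtraction rule, and the default `threshold` and `decay_shift` values are all assumptions made for this example and are not taken from the abstract.

```python
def saturate(value, bits):
    """Clamp an integer to the signed two's-complement range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def lif_step(v_mem, i_syn, in_spikes, weights, threshold=1000, decay_shift=3):
    """Advance one integer LIF neuron by one time step (hypothetical model).

    v_mem, i_syn : membrane and synaptic state, kept within 16-bit signed range
    in_spikes    : list of 0/1 input events for this step
    weights      : list of integer weights, clamped to 8-bit signed range
    Returns the tuple (v_mem, i_syn, spike).
    """
    # Leak: subtract a right-shifted copy of each state variable.
    # This decay rule is an assumption chosen because it needs no multiplier.
    i_syn -= i_syn >> decay_shift
    v_mem -= v_mem >> decay_shift

    # Integrate: only active inputs contribute, so sparse activity means
    # few additions per step.
    for spike_in, weight in zip(in_spikes, weights):
        if spike_in:
            i_syn += saturate(weight, 8)
    i_syn = saturate(i_syn, 16)
    v_mem = saturate(v_mem + i_syn, 16)

    # Fire, then reset by subtracting the threshold (assumed reset rule).
    spike = 1 if v_mem >= threshold else 0
    if spike:
        v_mem -= threshold
    return v_mem, i_syn, spike
```

Calling `lif_step` repeatedly, feeding the returned `v_mem` and `i_syn` back in, steps the neuron through a spike train using integer additions, shifts, and comparisons only.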
URL
https://arxiv.org/abs/2410.23776