Paper Reading AI Learner

'It is okay to be uncommon': Quantizing Sound Event Detection Networks on Hardware Accelerators with Uncommon Sub-Byte Support

2024-04-05 20:08:43
Yushu Wu, Xiao Quan, Mohammad Rasool Izadi, Chuan-Che Huang

Abstract

If our noise-canceling headphones can understand our audio environments, they can inform us of important sound events, tune equalization based on the type of content we listen to, and dynamically adjust noise-cancellation parameters based on the audio scene to further reduce distraction. However, running multiple audio understanding models on headphones with a limited energy budget and on-chip memory remains challenging. In this work, we identify a new class of neural network accelerators (e.g., the NE16 on GAP9) that allows network weights to be quantized at both common (e.g., 8-bit) and uncommon bit-widths (e.g., 3-bit). We then apply a differentiable neural architecture search to find the optimal bit-widths of a network on two different sound event detection tasks with potentially different requirements on quantization and prediction granularity (i.e., classification vs. embeddings for few-shot learning). We further evaluate our quantized models on actual hardware, showing that, compared to 8-bit models, they reduce memory usage, inference latency, and energy consumption by an average of 62%, 46%, and 61%, respectively, while maintaining floating-point performance. Our work sheds light on the benefits of such accelerators for sound event detection tasks when combined with an appropriate search method.
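To make the "uncommon bit-width" idea concrete, the sketch below shows symmetric weight quantization at an arbitrary bit-width such as 3 bits. This is an illustrative assumption, not the paper's actual scheme: the authors' accelerator (NE16 on GAP9) and their differentiable search may use a different quantizer, and the function names here are hypothetical.

```python
import numpy as np

def quantize_symmetric(weights, num_bits):
    """Symmetric quantization of weights to a signed num_bits grid.

    Illustrative sketch only; the paper's hardware-specific scheme
    (e.g., per-channel scales, rounding mode) may differ.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g., 3 for 3-bit signed weights
    scale = np.max(np.abs(weights)) / qmax  # map the largest |weight| to qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(16).astype(np.float32)

# 3-bit weights take only the 7 values {-3, ..., 3}, so storage drops
# from 8 bits to 3 bits per weight on hardware with sub-byte support.
q3, s3 = quantize_symmetric(w, num_bits=3)
w_hat = dequantize(q3, s3)
```

A differentiable architecture search in this setting would treat `num_bits` per layer as a choice variable (e.g., relaxing the discrete choice with a softmax over candidate bit-widths during training), then commit each layer to the bit-width the accelerator supports.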


URL

https://arxiv.org/abs/2404.04386

PDF

https://arxiv.org/pdf/2404.04386.pdf

