Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

Abstract

This paper introduces Multi-Resolution Rescored Byte-Track (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. This method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25$\times$ by alternating the processing of high-resolution images (320$\times$320 pixels) with multiple down-sized frames (192$\times$192 pixels). To tackle the accuracy degradation due to the reduced image input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassification using a novel probabilistic Rescore algorithm. By interleaving two down-sized images for every high-resolution one as the input of different state-of-the-art DNN object detectors with our MR2-ByteTrack, we demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme using exclusively full-resolution images. Code available at: this https URL
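
The interleaving pattern described above (one 320×320 frame followed by two 192×192 frames) can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, and the cost model assumes that detector compute scales linearly with the number of input pixels, which the abstract does not state.

```python
from itertools import islice

HIGH_RES = 320  # side length of high-resolution frames, from the abstract
LOW_RES = 192   # side length of down-sized frames, from the abstract


def resolution_schedule(low_per_high):
    """Yield input side lengths forever: one high-res frame,
    then `low_per_high` down-sized frames (hypothetical helper)."""
    while True:
        yield HIGH_RES
        for _ in range(low_per_high):
            yield LOW_RES


def relative_compute(low_per_high):
    """Average per-frame compute relative to an all-high-res baseline.

    ASSUMPTION: cost is proportional to pixel count, so a low-res
    frame costs (192/320)^2 = 0.36 of a high-res frame.
    """
    low_cost = (LOW_RES / HIGH_RES) ** 2
    return (1 + low_per_high * low_cost) / (1 + low_per_high)


# Two down-sized frames for every high-resolution one.
schedule = list(islice(resolution_schedule(2), 6))
avg = relative_compute(2)
print(schedule)
print(f"relative compute: {avg:.3f}")
```

Under this pixel-proportional assumption, the 1:2 pattern costs about 57% of the full-resolution baseline per frame, which is in line with the 43% latency reduction reported on GAP9. Measured latency on real hardware depends on the detector and the platform, so the figure is only a rough consistency check.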

URL

https://arxiv.org/abs/2404.11488

PDF

https://arxiv.org/pdf/2404.11488.pdf
