Abstract
This paper introduces Multi-Resolution Rescored Byte-Track (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. The method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25$\times$ by alternating the processing of high-resolution images (320$\times$320 pixels) with multiple down-sized frames (192$\times$192 pixels). To counter the accuracy degradation caused by the reduced input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassifications using a novel probabilistic Rescore algorithm. Using MR2-ByteTrack to interleave two down-sized images for every high-resolution one at the input of different state-of-the-art DNN object detectors, we demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme that uses exclusively full-resolution images. Code available at: this https URL
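As a rough sanity check on the interleaving scheme, the sketch below estimates the average per-frame cost when one 320x320 frame is followed by two 192x192 frames. It assumes that detector cost scales linearly with the number of input pixels, which is an illustrative simplification and not a claim made in the abstract; the function name `relative_cost` and its parameters are hypothetical. Measured latency on real hardware also depends on memory traffic and fixed per-inference overhead.

```python
def relative_cost(high_res: int, low_res: int, low_per_high: int) -> float:
    """Average per-frame cost relative to always running at high_res.

    Assumption (not from the paper): the cost of one inference is
    proportional to the number of input pixels of a square image.
    """
    # Cost of one low-resolution frame, with a high-resolution frame = 1.0.
    low_cost = (low_res / high_res) ** 2
    # One high-resolution frame plus `low_per_high` low-resolution frames,
    # averaged over the whole group of frames.
    return (1.0 + low_per_high * low_cost) / (1 + low_per_high)


# Two down-sized frames for every high-resolution one, as in the abstract.
cost = relative_cost(320, 192, 2)
print(f"relative cost: {cost:.3f}, estimated saving: {1 - cost:.1%}")
```

Under this pixel-count assumption a 192x192 frame costs 0.36 of a 320x320 frame, so the 1:2 schedule averages about 0.57 of the baseline cost. That is a saving of roughly 43%, in the same range as the latency reduction reported on GAP9.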
URL
https://arxiv.org/abs/2404.11488