Abstract
Deep learning-based models lead most driver observation benchmarks thanks to their remarkable accuracy, but they also incur high computational costs. This is challenging, as computational resources are often limited in real-world driving scenarios. This paper introduces a lightweight framework for resource-efficient driver activity recognition. The framework enhances 3D MobileNet, a neural architecture optimized for speed in video classification, with knowledge distillation and model quantization to balance accuracy and computational efficiency. Knowledge distillation helps preserve accuracy at a reduced model size by training the student on soft labels from a larger teacher model (I3D) rather than on the ground-truth labels alone. Model quantization substantially lowers memory and computation demands by representing model weights and activations with lower-precision integers. Extensive experiments on a public dataset for in-vehicle monitoring during autonomous driving show that the framework achieves a threefold reduction in model size and a 1.4-fold speedup in inference time compared to an already optimized architecture. The code for this study is available at this https URL.
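The knowledge-distillation objective described above can be sketched in plain Python. This is a minimal illustration of the standard soft-label distillation loss (Hinton-style), not the paper's implementation; the temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Softened softmax: a higher temperature spreads probability mass,
    # exposing the teacher's relative confidence across classes.
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx, T=4.0, alpha=0.7):
    # Soft term: cross-entropy between teacher and student
    # distributions, both softened by temperature T.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
    # Hard term: ordinary cross-entropy against the ground-truth label.
    hard = -math.log(softmax(student_logits)[true_idx])
    # T**2 rescales the soft-target gradients; alpha trades off the two terms.
    return alpha * (T ** 2) * soft + (1 - alpha) * hard
```

In the framework's setting, `teacher_logits` would come from the larger I3D teacher and `student_logits` from the 3D MobileNet student, so the student learns from the teacher's full output distribution instead of the one-hot labels alone.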
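Likewise, the idea behind model quantization can be shown with a minimal affine (asymmetric) int8 scheme. This is a generic sketch of mapping floating-point values onto 8-bit integers, not the specific quantization procedure used in the paper.

```python
def quantize_int8(weights):
    # Affine quantization: map the observed float range [lo, hi]
    # linearly onto the signed 8-bit range [-128, 127].
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # guard against a constant tensor
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover approximate float values from the int8 codes.
    return [(qi - zero_point) * scale for qi in q]
```

Storing each weight as one int8 instead of a float32 cuts memory fourfold for that tensor, and integer arithmetic on the quantized values is what drives the inference-time savings the abstract reports.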
URL
https://arxiv.org/abs/2311.05970