Paper Reading AI Learner

YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving

2019-04-11 13:08:05
Thomas Boulay, Said El-Hachimi, Mani Kumar Surisetti, Pullarao Maddu, Saranya Kandan

Abstract

In this paper, we propose a multi-task convolutional neural network (CNN) architecture optimized for a low power automotive grade SoC. We introduce a network based on a unified architecture where the encoder is shared among the two tasks namely detection and segmentation. The pro-posed network runs at 25FPS for 1280x800 resolution. We briefly discuss the methods used to optimize the network architecture such as using native YUV image directly, optimization of layers & feature maps and applying quantization. We also focus on memory bandwidth in our design as convolutions are data intensives and most SOCs are bandwidth bottlenecked. We then demonstrate the efficiency of our proposed network for a dedicated CNN accelerators presenting the key performance indicators (KPI) for the detection and segmentation tasks obtained from the hardware execution and the corresponding run-time.

Abstract (translated)

本文提出了一种针对低功耗汽车级SOC的多任务卷积神经网络(CNN)结构。我们介绍了一种基于统一体系结构的网络,其中编码器在检测和分割这两个任务之间共享。pro-posited网络以25fps的速度运行,分辨率为1280x800。本文简要讨论了网络结构优化的方法,如直接使用本机YUV图像、优化层和特征图以及应用量化等。在我们的设计中,我们也关注内存带宽,因为卷积是数据强度,而大多数SOC都是带宽瓶颈。然后,我们展示了我们针对专用CNN加速器的提议网络的效率,展示了从硬件执行和相应运行时获得的检测和分段任务的关键性能指标(KPI)。

URL

https://arxiv.org/abs/1904.05673

PDF

https://arxiv.org/pdf/1904.05673.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot