Paper Reading AI Learner

AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design

2019-03-18 00:16:04
Alexander Wong, Zhong Qiu Lin, Brendan Chwyl

Abstract

While deep neural networks have achieved state-of-the-art performance across a large number of complex tasks, it remains a significant challenge to deploy such networks in practical, on-device edge scenarios such as mobile devices, consumer devices, drones, and vehicles. In this study, we take a deeper exploration into a human-machine collaborative design approach for creating highly efficient deep neural networks through a synergy between principled network design prototyping and machine-driven design exploration. The efficacy of human-machine collaborative design is demonstrated through the creation of AttoNets, a family of highly efficient deep neural networks for on-device edge deep learning. Each AttoNet possesses a human-specified network-level macro-architecture comprising custom modules with unique machine-designed module-level macro-architecture and micro-architecture designs, all driven by human-specified design requirements. Experimental results for the task of object recognition showed that the AttoNets created via human-machine collaborative design have significantly fewer parameters and lower computational costs than state-of-the-art networks designed for efficiency while achieving noticeably higher accuracy (the smallest AttoNet achieves ~1.8% higher accuracy while requiring ~10x fewer multiply-add operations and parameters than MobileNet-V1). Furthermore, the efficacy of the AttoNets is demonstrated for the tasks of instance-level object segmentation and object detection, where an AttoNet-based Mask R-CNN network was constructed with significantly fewer parameters and lower computational costs (~5x fewer multiply-add operations and ~2x fewer parameters) than a ResNet-50 based Mask R-CNN network.
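The abstract compares networks by parameter count and multiply-add (MAC) count. As a minimal sketch of how these two metrics are computed, the snippet below counts parameters and MACs for a standard convolution and for the depthwise-separable factorization popularized by MobileNet-V1 (one of the baselines mentioned above). The layer shapes are hypothetical illustrations, not taken from the AttoNet paper, and the AttoNet modules themselves are machine-designed rather than this factorization.

```python
# Illustrative counting of parameters and multiply-add operations (MACs)
# for convolution layers. Layer shapes below are hypothetical examples,
# not the actual AttoNet or MobileNet-V1 configurations.

def conv2d_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a standard k x k convolution (bias ignored)."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out  # each filter is applied at every output pixel
    return params, macs

def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (the MobileNet-V1 factorization)."""
    dw_params = c_in * k * k       # one k x k filter per input channel
    pw_params = c_in * c_out       # 1 x 1 conv mixing channels
    params = dw_params + pw_params
    macs = params * h_out * w_out
    return params, macs

# Example: a 3x3 layer with 128 -> 128 channels on a 28x28 feature map.
std_p, std_m = conv2d_cost(128, 128, 3, 28, 28)
sep_p, sep_m = depthwise_separable_cost(128, 128, 3, 28, 28)
print(f"standard:  {std_p:,} params, {std_m:,} MACs")
print(f"separable: {sep_p:,} params, {sep_m:,} MACs ({std_m / sep_m:.1f}x fewer)")
```

For this hypothetical layer the factorization cuts both metrics by roughly 8x, which illustrates why such counts are the standard currency for edge-efficiency comparisons like the ~10x and ~5x reductions reported above.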

URL

https://arxiv.org/abs/1903.07209

PDF

https://arxiv.org/pdf/1903.07209.pdf

