Abstract
The success of deep neural networks relies on significant architecture engineering. Recently, neural architecture search (NAS) has emerged as a promising approach to greatly reduce the manual effort in network design by automatically searching for optimal architectures, although such algorithms typically need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, NAS, especially fast versions of NAS, is less studied on challenging vision tasks such as object detection. Here we propose to search for the decoder structure of object detectors with search efficiency taken into consideration. To be more specific, we aim to efficiently search for the feature pyramid network (FPN) as well as the prediction head of a simple anchor-free object detector, namely FCOS [20], using a tailored reinforcement learning paradigm. With a carefully designed search space, search algorithm, and strategy for evaluating network quality, we are able to efficiently search more than 2,000 architectures in around 30 GPU-days. The discovered architecture surpasses state-of-the-art object detection models (such as Faster R-CNN, RetinaNet and FCOS) by 1 to 1.9 points in AP on the COCO dataset, with comparable computation complexity and memory footprint, demonstrating the efficacy of the proposed NAS for object detection.
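The abstract describes a reinforcement-learning search loop: a controller samples candidate decoder architectures from a discrete search space, each candidate is scored with a fast proxy evaluation, and the controller is updated from the reward. A minimal REINFORCE-style sketch of that loop is below; the op names, the three-slot search space, and the proxy reward are illustrative assumptions, not the paper's actual search space or evaluation protocol.

```python
import math
import random

# Hypothetical toy search space for an FPN-style decoder: one candidate
# op per pyramid level. These names are illustrative only.
OPS = ["conv3x3", "dw_conv5x5", "skip"]
LEVELS = 3

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

class Controller:
    """REINFORCE-style controller: one categorical distribution per slot."""
    def __init__(self, lr=0.1):
        self.logits = [[0.0] * len(OPS) for _ in range(LEVELS)]
        self.lr = lr
        self.baseline = 0.0  # moving-average baseline to reduce variance

    def sample(self):
        # Draw one op index per decoder slot from the current policy.
        arch = []
        for slot in self.logits:
            probs = softmax(slot)
            r, acc = random.random(), 0.0
            for i, p in enumerate(probs):
                acc += p
                if r <= acc:
                    arch.append(i)
                    break
            else:
                arch.append(len(OPS) - 1)
        return arch

    def update(self, arch, reward):
        # Policy-gradient step: push up the log-prob of sampled choices,
        # scaled by the advantage (reward minus baseline).
        adv = reward - self.baseline
        self.baseline = 0.9 * self.baseline + 0.1 * reward
        for slot, choice in zip(self.logits, arch):
            probs = softmax(slot)
            for i in range(len(OPS)):
                grad = (1.0 if i == choice else 0.0) - probs[i]
                slot[i] += self.lr * adv * grad

def proxy_reward(arch):
    # Stand-in for the paper's fast quality evaluation of a sampled
    # decoder (e.g. short-schedule training); here we simply pretend
    # "conv3x3" is the best op at every level.
    return sum(1.0 for c in arch if OPS[c] == "conv3x3") / LEVELS

random.seed(0)
ctrl = Controller()
for _ in range(300):
    arch = ctrl.sample()
    ctrl.update(arch, proxy_reward(arch))

# Most-likely architecture under the trained controller.
best = [max(range(len(OPS)), key=lambda i: slot[i]) for slot in ctrl.logits]
print([OPS[i] for i in best])
```

In the actual method the reward comes from evaluating each sampled FPN/head on the detection task, which is where the ~30 GPU-days over 2,000+ architectures is spent; the toy reward above only demonstrates the control flow.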
URL
https://arxiv.org/abs/1906.04423