Abstract
Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As a surrogate task, we jointly address the ordering of visual data in the spatial and temporal domains. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed, preselected set. Based on deep reinforcement learning, we propose a sampling policy that adapts to the state of the network being trained. New permutations are therefore sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest-neighbor retrieval.
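The contrast between sampling permutations uniformly from a fixed set and sampling them according to an estimated utility can be illustrated with a small sketch. This is not the authors' reinforcement learning policy: the function names, the softmax weighting, the temperature parameter, and the hand-set utility values are all assumptions chosen for illustration. In the proposed method, the utilities would come from a learned policy that observes the state of the network.

```python
import math
import random
from itertools import permutations


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_uniform(perm_set, rng):
    """Baseline: every permutation in the fixed set is equally likely."""
    return rng.choice(perm_set)


def sample_by_utility(perm_set, utilities, rng, temperature=1.0):
    """Hypothetical adaptive sampler: permutations with a higher
    estimated utility are drawn more often (assumed softmax weighting)."""
    probs = softmax(utilities, temperature)
    return rng.choices(perm_set, weights=probs, k=1)[0]


def apply_permutation(parts, perm):
    """Reorder image tiles or video frames according to a permutation."""
    return [parts[i] for i in perm]


if __name__ == "__main__":
    rng = random.Random(0)
    # A small fixed set of permutations over four parts (tiles or frames).
    perm_set = list(permutations(range(4)))[:8]
    # Placeholder utilities; a learned policy would supply these values.
    utilities = [0.0] * len(perm_set)
    utilities[3] = 2.0
    perm = sample_by_utility(perm_set, utilities, rng)
    print(perm, apply_permutation(["a", "b", "c", "d"], perm))
```

Lowering `temperature` concentrates the draws on the highest-utility permutations, while a large value approaches the uniform baseline.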
URL
https://arxiv.org/abs/1807.11293