Abstract
In recent years, there has been considerable interest in achieving complex reasoning with deep neural networks. To this end, models such as Memory Networks (MemNNs) have combined external memory storage with attention mechanisms. These architectures, however, lack the more complex reasoning mechanisms that would allow, for instance, relational reasoning. Relation Networks (RNs), on the other hand, have shown outstanding results on relational reasoning tasks. Unfortunately, their computational cost grows quadratically with the number of memories, which is prohibitive for larger problems. To address these issues, we introduce the Working Memory Network, a MemNN architecture with a novel working memory storage and reasoning module. Our model retains the relational reasoning abilities of the RN while reducing its computational complexity from quadratic to linear. We tested our model on the text QA dataset bAbI and the visual QA dataset NLVR. On the jointly trained bAbI-10k, we set a new state of the art, achieving a mean error of less than 0.5%. Moreover, a simple ensemble of two of our models solves all 20 tasks in the joint version of the benchmark.
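The quadratic-versus-linear claim can be illustrated with a rough operation count. The sketch below is not the paper's implementation. It assumes, purely for illustration, that a fixed number of attention passes over the n memories fills a small working buffer and that pairwise relations are then scored only among the buffer slots. The function names and the `slots` parameter are hypothetical.

```python
def rn_pair_count(n: int) -> int:
    """Ordered memory pairs scored by a plain Relation Network: n * n."""
    return n * n


def buffered_cost(n: int, slots: int) -> int:
    """Hypothetical cost model (an assumption, not taken from the paper):
    `slots` attention passes over n memories, followed by pairwise
    relations among the `slots` buffer entries only."""
    return slots * n + slots * slots


if __name__ == "__main__":
    # With `slots` held fixed, the second column grows as n^2
    # while the third grows linearly in n.
    for n in (10, 100, 1000):
        print(n, rn_pair_count(n), buffered_cost(n, slots=4))
```

Under this assumed cost model, growing n from 100 to 1000 multiplies the pairwise count by 100 but the buffered count by roughly 10.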
URL
https://arxiv.org/abs/1805.09354