Abstract
Capsule networks are neural networks that identify image parts and hierarchically form the instantiation parameters of a whole. The network aims to perform an inverse computer-graphics task, with its parameters acting as mapping weights that transform parts into wholes. Training capsule networks on complex data with high intra-class or intra-part variation is challenging. This paper presents a multi-prototype architecture that guides capsule networks to represent the variations in image parts. Instead of a single capsule per class and part, the proposed method employs several capsules (co-group capsules) that capture multiple prototypes of an object. In the final layer, the co-group capsules compete, and their soft output serves as the target of a competitive cross-entropy loss. In the middle layers, only the most active capsules map to the next layer, with weights shared among co-groups. The resulting reduction in parameters through implicit weight sharing makes deeper capsule-network layers feasible. Experimental results on the MNIST, SVHN, C-Cube, CEDAR, MCYT, and UTSig datasets show that the proposed model outperforms others in image classification accuracy.
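The abstract does not give the exact loss formulation, but the co-group competition it describes can be sketched roughly as follows: each class owns several prototype capsules, the most active one wins the within-group competition, and a cross-entropy loss is applied to the softmax over the winning activations. All function names and the activation-length matrix below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def competitive_class_scores(capsule_lengths):
    """capsule_lengths: (num_classes, num_prototypes) activation lengths.
    Co-group capsules compete: the most active prototype represents its
    class (a max over the prototype axis). This winner-take-all form is
    an assumption for illustration."""
    return capsule_lengths.max(axis=1)

def softmax(x):
    # numerically stable softmax over class scores
    e = np.exp(x - x.max())
    return e / e.sum()

def competitive_cross_entropy(capsule_lengths, target_class):
    """Cross-entropy between the soft output of the winning co-group
    capsules and a one-hot class target."""
    probs = softmax(competitive_class_scores(capsule_lengths))
    return -np.log(probs[target_class])

# toy example: 3 classes, 4 prototype capsules per class
lengths = np.array([[0.1, 0.9, 0.2, 0.3],
                    [0.4, 0.2, 0.1, 0.2],
                    [0.3, 0.1, 0.5, 0.2]])
loss = competitive_cross_entropy(lengths, target_class=0)
```

In this sketch the max over the prototype axis is what lets multiple capsules per class specialize to different appearance variations, since only the best-matching prototype carries gradient for a given input.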
URL
https://arxiv.org/abs/2404.15445