Abstract
In class incremental learning, neural networks typically suffer from catastrophic forgetting. We show that an MLP featuring a sparse activation function and an adaptive learning rate optimizer can compete with established regularization techniques in the Split-MNIST task. We highlight the effectiveness of the Adaptive SwisH (ASH) activation function in this context and introduce a novel variant, Hard Adaptive SwisH (Hard ASH), to further enhance learning retention.
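The abstract does not spell out the activation's mechanism. As a rough illustration only, a Hard-ASH-style sparse activation is often realized as a per-sample percentile threshold that zeroes most units, combined with a hard-Swish gate on the survivors. The sketch below follows that reading; the `keep_ratio` parameter, the quantile threshold rule, and the hard-Swish gate are assumptions for illustration, not the paper's exact definition.

```python
import torch
import torch.nn.functional as F

def hard_ash(x: torch.Tensor, keep_ratio: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch of a Hard-ASH-style sparse activation.

    Keeps only the largest `keep_ratio` fraction of pre-activations per
    sample and gates them with a hard-Swish; all other units are zeroed.
    The threshold rule and gate are assumptions, not the paper's exact
    formulation.
    """
    # Per-sample threshold at the (1 - keep_ratio) quantile of the last dim.
    thresh = torch.quantile(x, 1.0 - keep_ratio, dim=-1, keepdim=True)
    mask = (x >= thresh).to(x.dtype)
    # Hard-Swish gate applied only to the surviving units.
    return F.hardswish(x) * mask

# Usage: sparsify a batch of hidden pre-activations from an MLP layer.
hidden = torch.randn(32, 256)          # batch of 32, 256 hidden units
sparse_out = hard_ash(hidden, 0.1)     # roughly 90% of units become zero
```

A soft variant along the same lines would swap `F.hardswish` for `F.silu` (Swish); the hard gate is typically chosen for its cheaper, piecewise-linear form.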
URL
https://arxiv.org/abs/2404.17651