Abstract
In class incremental learning, neural networks typically suffer from catastrophic forgetting. We show that an MLP featuring a sparse activation function and an adaptive learning rate optimizer can compete with established regularization techniques in the Split-MNIST task. We highlight the effectiveness of the Adaptive SwisH (ASH) activation function in this context and introduce a novel variant, Hard Adaptive SwisH (Hard ASH), to further enhance learning retention.
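The abstract does not spell out the activation's mechanism. As a rough illustration only, a Hard-ASH-style sparse activation is often realized as a per-sample percentile threshold that zeroes most units, combined with a hard-Swish gate on the survivors. The sketch below follows that reading; the `keep_ratio` parameter, the quantile threshold rule, and the hard-Swish gate are assumptions for illustration, not the paper's exact definition.

```python
import torch
import torch.nn.functional as F

def hard_ash(x: torch.Tensor, keep_ratio: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch of a Hard-ASH-style sparse activation.

    Keeps only the largest `keep_ratio` fraction of pre-activations per
    sample and gates them with a hard-Swish; all other units are zeroed.
    The threshold rule and gate are assumptions, not the paper's exact
    formulation.
    """
    # Per-sample threshold at the (1 - keep_ratio) quantile of the last dim.
    thresh = torch.quantile(x, 1.0 - keep_ratio, dim=-1, keepdim=True)
    mask = (x >= thresh).to(x.dtype)
    # Hard-Swish gate applied only to the surviving units.
    return F.hardswish(x) * mask

# Usage: sparsify a batch of hidden pre-activations from an MLP layer.
hidden = torch.randn(32, 256)          # batch of 32, 256 hidden units
sparse_out = hard_ash(hidden, 0.1)     # roughly 90% of units become zero
```

A soft variant along the same lines would swap `F.hardswish` for `F.silu` (Swish); the hard gate is typically chosen for its cheaper, piecewise-linear form.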
URL
https://arxiv.org/abs/2404.17651