Abstract
Deep neural networks (DNNs) are often trained on the premise that the complete training data set is provided ahead of time. However, in real-world scenarios, data often arrive in chunks over time. This leads to important considerations about the optimal strategy for training DNNs, such as whether to fine-tune them with each chunk of incoming data (warm-start) or to retrain them from scratch with the entire corpus of data whenever a new chunk is available. While the latter can be resource-intensive, recent work has pointed out the lack of generalization in warm-start models. Therefore, to strike a balance between efficiency and generalization, we introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for DNNs. LURE alternates between an unlearning phase, which selectively forgets undesirable information in the model through weight reinitialization in a data-dependent manner, and a relearning phase, which emphasizes learning on generalizable features. We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings. We further show that it leads to more robust and well-calibrated models.
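The alternating learn/unlearn/relearn loop described above can be sketched on a toy linear regressor. This is a minimal illustration, not the paper's method: the reinitialization criterion used here (resetting the smallest-magnitude weights) and all hyperparameters are assumptions standing in for the paper's data-dependent selection.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(w, X, y, lr=0.1, epochs=50):
    """Plain gradient descent on squared error (the learn/relearn phases)."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def unlearn(w, reinit_frac=0.3):
    """Unlearning phase: reinitialize a fraction of the weights.
    Selecting the smallest-magnitude weights is an illustrative stand-in
    for the paper's data-dependent criterion."""
    w = w.copy()
    k = int(reinit_frac * len(w))
    idx = np.argsort(np.abs(w))[:k]
    w[idx] = rng.normal(scale=0.01, size=k)  # fresh random init
    return w

# Toy online setting: data arrives in chunks over time.
d = 8
w_true = rng.normal(size=d)
w = rng.normal(scale=0.01, size=d)
init_err = np.linalg.norm(w - w_true)

for _ in range(5):                 # five incoming chunks
    X = rng.normal(size=(64, d))
    y = X @ w_true
    w = train(w, X, y)             # learn on the new chunk
    w = unlearn(w)                 # forget a data-dependent subset
    w = train(w, X, y)             # relearn

final_err = np.linalg.norm(w - w_true)
print(final_err)
```

In this sketch each chunk triggers one learn/unlearn/relearn round, so the model is never retrained from scratch on the full corpus; the unlearning step is what distinguishes the loop from plain warm-starting.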
URL
https://arxiv.org/abs/2303.10455