A simple theory for training response of deep neural networks

Abstract

Deep neural networks give us a powerful method to model the relationship between input and output in a training dataset. We can regard such a network as a complex adaptive system consisting of many artificial neurons that work together as an adaptive memory. The network's behavior is its training dynamics, with a feedback loop from the evaluation of the loss function. It is already known that the training response can be constant or show power-law-like aging in some ideal situations. However, gaps remain between those findings and other complex phenomena, such as network fragility. To fill the gap, we introduce a very simple network and analyze it. We show that the training response consists of several distinct factors that depend on training stage, activation function, or training method. In addition, we show feature space reduction as an effect of stochastic training dynamics, which can result in network fragility. Finally, we discuss some complex phenomena of deep networks.

URL

https://arxiv.org/abs/2405.04074

PDF

https://arxiv.org/pdf/2405.04074.pdf
