Abstract
Gradient descent has proven to be a powerful and effective technique for optimization in numerous machine learning applications. Recent advances in computational neuroscience have shown that learning under the standard gradient descent formulation is not consistent with learning in biological systems. This has opened up interesting avenues for building biologically inspired learning techniques. One such approach is inspired by Dale's law, which states that inhibitory and excitatory synapses do not swap roles during the course of learning. The resulting exponential gradient descent optimization scheme leads to log-normally distributed synaptic weights. Interestingly, the density that satisfies the Fokker-Planck equation corresponding to the stochastic differential equation (SDE) with geometric Brownian motion (GBM) is the log-normal density. Leveraging this connection, we start from the SDE governing geometric Brownian motion and show that discretizing the corresponding reverse-time SDE yields a multiplicative update rule, which, surprisingly, coincides with the sampling equivalent of the exponential gradient descent update founded on Dale's law. Furthermore, we propose a new formalism for multiplicative denoising score matching, subsuming the loss function proposed by Hyvärinen for non-negative data. Indeed, log-normally distributed data are positive, and the proposed score-matching formalism turns out to be a natural fit. This enables training score-based models on image data and results in a novel multiplicative update scheme for sample generation starting from a log-normal density. Experimental results on the MNIST, Fashion MNIST, and Kuzushiji datasets demonstrate the generative capability of the new scheme. To the best of our knowledge, this is the first instance of a biologically inspired generative model employing multiplicative updates, founded on geometric Brownian motion.
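As background, the following standard relations (not taken from the paper; included only to illustrate why a GBM discretization is naturally multiplicative) give the forward GBM SDE and its log-space Euler-Maruyama step:

dX_t = \mu X_t \, dt + \sigma X_t \, dW_t,
X_{k+1} = X_k \exp\!\big( (\mu - \tfrac{1}{2}\sigma^2)\,\Delta t + \sigma \sqrt{\Delta t}\, z_k \big), \qquad z_k \sim \mathcal{N}(0, 1).

Each step thus multiplies the current state by a log-normal factor, and the marginal density of X_t is log-normal, consistent with the solution of the associated Fokker-Planck equation. Discretizing the reverse-time SDE in an analogous fashion retains this multiplicative structure, which is the connection the abstract alludes to.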
URL
https://arxiv.org/abs/2510.02730