Abstract
This paper proposes a new approach to Machine Learning (ML) that focuses on unsupervised continuous context-dependent learning of complex patterns. Although the proposal is partly inspired by some of the current knowledge about the structural and functional properties of the mammalian brain, we do not claim that biological systems work in an analogous way (nor the opposite). Based on some properties of the cerebellar cortex and adjacent structures, a proposal suitable for practical problems is presented. A synthetic structure capable of identifying and predicting complex temporal series will be defined and experimentally tested. The system relies heavily on prediction to help identify and learn patterns based on previously acquired contextual knowledge. As a proof of concept, the proposed system is shown to be able to learn, identify and predict a remarkably complex temporal series such as human speech, with no prior knowledge. From raw data, without any adaptation in the core algorithm, the system is able to identify certain speech structures from a set of Spanish sentences. Unlike conventional ML, the proposal can learn with a reduced training set. Although the idea can be applied to a constrained problem, such as the detection of unknown vocabulary in a speech, it could be used in more applications, such as vision, or (by incorporating the missing biological periphery) fit into other ML techniques. Given the trivial computational primitives used, a potential hardware implementation will be remarkably frugal. Coincidentally, the proposed model not only conforms to a plausible functional framework for biological systems but may also explain many elusive cognitive phenomena.
Abstract (translated)
本文提出了一种新的机器学习(ML)方法,重点关注无监督的连续上下文相关学习复杂模式。尽管在某种程度上,这个建议受到了哺乳动物大脑的结构和功能性质的一些现有知识的影响,但我们不声称生物系统以类似的方式运行(也不相反)。基于小脑皮质和相邻结构的一些性质,提出了一种适用于实际问题的建议。将定义一个能够识别和预测复杂时间序列的合成结构,并进行实验验证。系统依赖于预测来帮助根据先前的上下文知识识别和学习模式。作为一种概念证明,所提出的系统能够学习、识别并预测类似于人类语音的复杂时间序列,而无需任何先验知识。从原始数据中,无需对核心算法进行调整,系统能够从一系列西班牙句子中识别出某些语音结构。与传统的ML不同,这个建议可以在较小的训练集上学习。尽管这个想法可以应用于一些约束性的问题,如在语音中检测未知词汇,但它也可以应用于更多的应用场景,如视觉,或者(通过融入缺失的生物外围系统)与其他ML技术相融合。考虑到所使用的简单计算原语,这个潜在的硬件实现将非常节俭。巧合的是,所提出的模型不仅符合生物系统的合理功能框架,而且还可以解释许多令人困惑的认知现象。
URL
https://arxiv.org/abs/2405.02371