Abstract
Breakthroughs in deep learning and memory networks have driven major advances in natural language understanding. Language is sequential, and the information carried through a sequence can be captured by memory networks. Learning the sequence is one of the key aspects of learning a language. However, memory networks cannot hold infinitely long sequences in memory and are limited by constraints such as the vanishing and exploding gradient problems. As a result, natural language understanding models degrade when presented with long sequential text. We introduce the Long Term Memory network (LTM), which learns from infinitely long sequences. LTM gives priority to the current inputs, allowing them to have a high impact. Because language modeling is an important component of natural language understanding and requires long-term memory, we evaluate LTM on this task. LTM is tested on the Penn Treebank, Google Billion Word, and WikiText-2 datasets, and we compare it with other language models that require long-term memory.
URL
https://arxiv.org/abs/2305.11462