Abstract
Anticipating how an audience will react to a text is important in many areas of society, from politics and research to commercial industries. Sentiment analysis (SA) is a natural language processing (NLP) technique that uses lexical/statistical and deep learning methods to determine whether texts of varying length express positive, negative, or neutral emotions. Recurrent networks are widely used in the machine-learning community for problems involving sequential data. However, a drawback of models based on Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) is their large number of parameters, which makes them computationally expensive. This drawback is even more significant when the available data are limited. Such models also require significant over-parameterization and regularization to achieve optimal performance. Tensorized models represent a potential solution. In this paper, we classify the sentiment of a set of social media posts. We compare traditional recurrent models with their tensorized versions and show that the tensorized models reach performance comparable to that of the traditional models while using fewer resources for training.
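To make the parameter-count argument concrete, the following is a minimal sketch that compares the number of weights in a standard LSTM or GRU layer with the number of weights in a tensor-train (TT) factorization of one of its weight matrices. The abstract does not say which tensorization scheme the paper uses, so the TT format, the function names, the layer sizes, the mode factors, and the ranks below are all illustrative assumptions. The counts also assume one bias vector per gate; some libraries store two.

```python
def lstm_params(input_size, hidden_size):
    """Weights in one LSTM layer: 4 gates, each with input weights,
    recurrent weights, and one bias vector (assumed convention)."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def gru_params(input_size, hidden_size):
    """Weights in one GRU layer: same layout as the LSTM, but 3 gates."""
    return 3 * (hidden_size * (input_size + hidden_size) + hidden_size)


def tt_matrix_params(in_factors, out_factors, ranks):
    """Weights in a tensor-train factorization of a dense matrix.

    The matrix has prod(in_factors) rows and prod(out_factors) columns.
    Core k has shape (ranks[k], in_factors[k], out_factors[k], ranks[k + 1]),
    and the boundary ranks are fixed to 1.
    """
    if not (len(in_factors) == len(out_factors) == len(ranks) - 1):
        raise ValueError("need one rank more than the number of cores")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary ranks must be 1")
    return sum(
        ranks[k] * in_factors[k] * out_factors[k] * ranks[k + 1]
        for k in range(len(in_factors))
    )


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    input_size, hidden_size = 256, 256
    print("LSTM layer:", lstm_params(input_size, hidden_size))
    print("GRU layer:", gru_params(input_size, hidden_size))

    # Dense input-to-hidden matrix for all four LSTM gates: 256 x 1024.
    dense = input_size * 4 * hidden_size
    # TT version: 256 = 4 * 8 * 8 and 1024 = 8 * 8 * 16, inner ranks of 4.
    tt = tt_matrix_params((4, 8, 8), (8, 8, 16), (1, 4, 4, 1))
    print("dense input-to-hidden matrix:", dense)
    print("tensor-train input-to-hidden matrix:", tt)
```

With these assumed sizes, the factorized matrix stores far fewer weights than the dense one. Whether accuracy is preserved depends on the chosen ranks and on the data, which is the empirical question the paper addresses.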
URL
https://arxiv.org/abs/2306.09705