Abstract
Many classification models perform poorly on short texts due to data sparsity. To address this issue, we propose topic memory networks for short text classification, with a novel topic memory mechanism that encodes latent topic representations indicative of class labels. Unlike most prior work, which extends features with external knowledge or pre-trained topics, our model jointly performs topic inference and text classification with memory networks in an end-to-end manner. Experimental results on four benchmark datasets show that our model outperforms state-of-the-art models on short text classification while also generating coherent topics.
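To make the core idea concrete, here is a minimal NumPy sketch of a memory-style read over latent topics: a document encoding attends over topic "memory" entries, and the retrieved topic-aware vector is concatenated with the document encoding before classification. This is an illustrative simplification under assumed shapes and names (`topic_memory_attention`, `topic_keys`, `topic_values` are hypothetical), not the paper's exact architecture, which jointly trains a neural topic model with the classifier.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for the attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topic_memory_attention(doc_repr, topic_keys, topic_values):
    """Memory-network-style read over K latent topics (illustrative only).

    doc_repr:     (d,)   encoded short text
    topic_keys:   (K, d) addressing vectors, one per latent topic
    topic_values: (K, d) content vectors, one per latent topic
    Returns a (2d,) feature: document encoding + retrieved topic summary.
    """
    scores = topic_keys @ doc_repr           # (K,) match each topic to the text
    weights = softmax(scores)                # attention distribution over topics
    read = weights @ topic_values            # (d,) weighted topic summary
    return np.concatenate([doc_repr, read])  # input to a downstream classifier

rng = np.random.default_rng(0)
d, K = 8, 4
doc = rng.normal(size=d)
keys = rng.normal(size=(K, d))
vals = rng.normal(size=(K, d))
feat = topic_memory_attention(doc, keys, vals)
print(feat.shape)  # (16,)
```

In the full model, the topic memory would be learned jointly with the classifier end-to-end, so the topics are shaped by the classification signal rather than pre-trained separately.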
URL
https://arxiv.org/abs/1809.03664