Paper Reading AI Learner

Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization

2019-05-28 01:15:11
Ting Huang, Gehui Shen, Zhi-Hong Deng

Abstract

Recurrent Neural Networks (RNNs) are widely used in natural language processing (NLP), for tasks ranging from text categorization to question answering and machine translation. However, RNNs generally read the whole text from beginning to end (or occasionally in reverse), which makes processing long texts inefficient. When reading a long document for a categorization task, such as topic categorization, many words are irrelevant and can be skipped. To this end, we propose Leap-LSTM, an LSTM-enhanced model which dynamically leaps between words while reading text. At each step, we use several feature encoders to extract information from the preceding text, the following text and the current word, and then decide whether to skip the current word. We evaluate Leap-LSTM on several text categorization tasks: sentiment analysis, news categorization, ontology classification and topic classification, using five benchmark data sets. The experimental results show that our model reads faster and predicts better than a standard LSTM. Compared to previous models which can also skip words, our model achieves better trade-offs between performance and efficiency.
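The per-step skip decision described above can be sketched in plain Python. This is a minimal illustrative toy, not the authors' implementation: the pseudo-embeddings, the simplified recurrent update, and the hand-set score in `skip_probability` are all assumptions standing in for the paper's learned feature encoders and LSTM; the real model trains these components end to end.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def encode_word(word, dim=8):
    # Deterministic hash-based pseudo-embedding for the current word
    # (a stand-in for a learned word embedding).
    seed = sum(ord(c) for c in word)
    return [math.sin(seed * (i + 1)) for i in range(dim)]

def lstm_step(h, x):
    # Simplified recurrent update; the actual model uses full LSTM gates.
    return [math.tanh(0.5 * hi + 0.5 * xi) for hi, xi in zip(h, x)]

def skip_probability(h_prev, x_cur, lookahead):
    # Combine features from the preceding text (the hidden state h_prev),
    # the current word (x_cur) and the following text (lookahead), then
    # squash the score into [0, 1]. The weighting here is illustrative.
    score = sum(h_prev) - sum(x_cur) + 0.1 * sum(lookahead)
    return sigmoid(score)

def leap_read(words, dim=8, threshold=0.5):
    """Read a word sequence, skipping words the gate deems irrelevant."""
    h = [0.0] * dim
    kept = []
    for i, w in enumerate(words):
        x = encode_word(w, dim)
        # "Following text" feature: mean embedding of the next few words.
        ahead = [encode_word(v, dim) for v in words[i + 1:i + 4]] or [[0.0] * dim]
        look = [sum(col) / len(ahead) for col in zip(*ahead)]
        if skip_probability(h, x, look) > threshold:
            continue  # leap over this word: the state is not updated
        h = lstm_step(h, x)
        kept.append(w)
    return h, kept
```

Because skipped words never reach the recurrent update, the cost of reading scales with the number of kept words rather than the full document length, which is the efficiency trade-off the paper evaluates.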


URL

https://arxiv.org/abs/1905.11558

PDF

https://arxiv.org/pdf/1905.11558.pdf

