Paper Reading AI Learner

Suffix Bidirectional Long Short-Term Memory


Abstract

Recurrent neural networks have become ubiquitous in computing representations of sequential data, especially textual data in natural language processing. In particular, Bidirectional LSTMs are at the heart of several neural models achieving state-of-the-art performance in a wide variety of tasks in NLP. We propose a general and effective improvement to the BiLSTM model which encodes each suffix and prefix of a sequence of tokens in both forward and reverse directions. We call our model Suffix BiLSTM or SuBiLSTM. Using an extensive set of experiments, we demonstrate that using SuBiLSTM instead of a BiLSTM in existing base models leads to improvements in performance in learning general sentence representations, text classification, textual entailment and named entity recognition. We achieve new state-of-the-art results for fine-grained sentiment classification and question classification using SuBiLSTM.

Abstract (translated)

递归神经网络已经变得普遍用于计算顺序数据的表示,尤其是自然语言处理中的文本数据。特别是,双向LSTM是几个神经模型的核心,在NLP中的各种任务中实现了最先进的性能。我们提出对BiLSTM模型的一般和有效改进,该模型在前向和反向两个方向上编码每个后缀和一系列令牌的前缀。我们将模型称为Suffix BiLSTM或SuBiLSTM。使用一组广泛的实验,我们证明在现有基础模型中使用SuBiLSTM而不是BiLSTM可以提高学习一般句子表示,文本分类,文本蕴涵和命名实体识别的性能。我们使用SuBiLSTM实现细粒度情感分类和问题分类的最新结果。

URL

https://arxiv.org/abs/1805.07340

PDF

https://arxiv.org/pdf/1805.07340.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot