Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

2018-08-28 21:44:26
Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester

Abstract

Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and whether those patterns coincide with manually defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks, which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models for the task of morphological tagging on three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation can be found at https://github.com/FredericGodin/ContextualDecomposition-NLP .

URL

https://arxiv.org/abs/1808.09551

PDF

https://arxiv.org/pdf/1808.09551.pdf

