Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

2018-08-28 21:44:26

Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester

arXiv_AI

arXiv_AI Segmentation CNN Memory_Networks Prediction Quantitative


Abstract

Character-level features are currently used in a variety of neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively, while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and whether those patterns coincide with manually defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks, which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models on the task of morphological tagging for three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation is available online: https://github.com/FredericGodin/ContextualDecomposition-NLP
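To give a rough feel for what decomposing a character-level convolution can look like, here is a minimal sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). It assumes a single linear convolutional filter followed by max-over-time pooling, with no nonlinearity, and all names (`conv1d`, `decompose`, the toy word, the random embeddings) are hypothetical. Because a convolution is linear, the pooled filter output can be split exactly into a part caused by a chosen set of characters, a part caused by the remaining characters, and the bias.

```python
import random


def conv1d(x, w, b=0.0):
    """Valid 1D convolution of one filter over a sequence of character embeddings.

    x: list of T embedding vectors (each a list of D floats)
    w: list of K weight vectors (each a list of D floats), K = filter width
    b: scalar bias added at every position
    """
    out = []
    for t in range(len(x) - len(w) + 1):
        s = b
        for k in range(len(w)):
            s += sum(wi * xi for wi, xi in zip(w[k], x[t + k]))
        out.append(s)
    return out


def decompose(x, w, b, relevant):
    """Split the max-pooled filter output into relevant, irrelevant and bias parts.

    relevant: set of character positions whose contribution we want to isolate.
    Returns (relevant_part, irrelevant_part, bias, pooled_output).
    """
    zeros = [0.0] * len(x[0])
    # Zero out the characters that do not belong to each group.
    x_rel = [v if i in relevant else zeros for i, v in enumerate(x)]
    x_irr = [zeros if i in relevant else v for i, v in enumerate(x)]

    total = conv1d(x, w, b)
    # Max-over-time pooling: keep the position where the full output is largest.
    t_star = max(range(len(total)), key=total.__getitem__)

    # Linearity lets us evaluate each group separately at the pooled position.
    rel = conv1d(x_rel, w)[t_star]
    irr = conv1d(x_irr, w)[t_star]
    return rel, irr, b, total[t_star]


# Toy example: random embeddings for the characters of "walked";
# positions 4 and 5 ("ed") are treated as the candidate pattern.
random.seed(0)
word = "walked"
D, K = 4, 3  # embedding size and filter width (arbitrary choices)
x = [[random.uniform(-1, 1) for _ in range(D)] for _ in word]
w = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
b = 0.1

rel, irr, bias, pooled = decompose(x, w, b, relevant={4, 5})
print("relevant:", rel, "irrelevant:", irr, "bias:", bias, "pooled:", pooled)
```

In this linear setting the three parts sum to the pooled output up to floating-point error. A real model also contains nonlinearities and further layers, which a full decomposition has to handle and which this sketch deliberately leaves out.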


URL

https://arxiv.org/abs/1808.09551

PDF

https://arxiv.org/pdf/1808.09551.pdf