Abstract
In this paper, we introduce a novel method for interpreting recurrent neural networks (RNNs), particularly long short-term memory networks (LSTMs), at the cellular level. We propose a systematic pipeline for interpreting individual hidden state dynamics within the network using response characterization methods. The ranked contribution of individual cells to the network's output is computed by analyzing a set of interpretable metrics of their decoupled step and sinusoidal responses. As a result, our method is able to uniquely identify neurons with insightful dynamics, quantify relationships between dynamical properties and test accuracy through ablation analysis, and interpret the impact of network capacity on a network's dynamical distribution. Finally, we demonstrate the generalizability and scalability of our method by evaluating it on a series of different benchmark sequential datasets.
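The idea of characterizing a cell by its step and sinusoidal responses can be illustrated with a minimal sketch. The abstract does not specify the decoupling procedure or the exact metrics, so everything here is an assumption made for illustration: a single LSTM unit is simulated in isolation with only a self-recurrent connection, and the function names (`lstm_unit_response`, `step_metrics`, `sine_amplitude`), the parameters `w`, `r`, `b`, `tol`, and the chosen metrics (final value, peak, settling time, steady-state amplitude) are hypothetical rather than taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_unit_response(u, w, r, b):
    """Simulate one isolated LSTM unit on a scalar input sequence u.

    w, r, b are arrays of shape (4,) holding the input weights,
    self-recurrent weights, and biases of the gates in the order
    (input, forget, candidate, output). Assumption: the unit sees
    only its own previous hidden state, not those of other cells.
    """
    h, c = 0.0, 0.0
    out = np.empty(len(u))
    for t, x in enumerate(u):
        z = w * x + r * h + b
        i, f, o = sigmoid(z[0]), sigmoid(z[1]), sigmoid(z[3])
        g = np.tanh(z[2])
        c = f * c + i * g          # cell state update
        h = o * np.tanh(c)         # hidden state (the observed response)
        out[t] = h
    return out


def step_metrics(y, tol=0.02):
    """Final value, peak magnitude, and settling time of a step response.

    Settling time is the first index from which the response stays
    within a band of tol * |final value| around the final value.
    """
    final = y[-1]
    band = tol * max(abs(final), 1e-12)
    outside = np.flatnonzero(np.abs(y - final) > band)
    settling = 0 if outside.size == 0 else int(outside[-1]) + 1
    return {
        "final": float(final),
        "peak": float(np.max(np.abs(y))),
        "settling_time": settling,
    }


def sine_amplitude(y, n_discard):
    """Half the peak-to-peak range after discarding the transient."""
    tail = y[n_discard:]
    return float((tail.max() - tail.min()) / 2.0)


# Illustrative, hand-picked parameters (not trained weights).
w = np.array([1.0, 1.0, 1.0, 1.0])
r = np.array([0.5, 0.5, 0.5, 0.5])
b = np.zeros(4)

T = 200
t = np.arange(T)
step_input = np.ones(T)
sine_input = np.sin(2.0 * np.pi * 0.05 * t)  # 0.05 cycles per time step

step_resp = lstm_unit_response(step_input, w, r, b)
sine_resp = lstm_unit_response(sine_input, w, r, b)

print(step_metrics(step_resp))
print(sine_amplitude(sine_resp, n_discard=100))
```

In a trained network, metrics of this kind could be computed for every cell and used to rank cells before ablation; how the paper actually ranks cells is not stated in the abstract.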
URL
https://arxiv.org/abs/1809.03864