Abstract
We inspect the multi-head self-attention in Transformer NMT encoders for three source languages, looking for patterns that could have a syntactic interpretation. In many of the attention heads, we frequently find sequences of consecutive states attending to the same position, which resemble syntactic phrases. We propose a transparent deterministic method of quantifying the amount of syntactic information present in the self-attentions, based on automatically building and evaluating phrase-structure trees from the phrase-like sequences. We compare the resulting trees to existing constituency treebanks, both manually and by computing precision and recall.
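
To make the span-extraction idea concrete, here is a minimal sketch (our own illustration, not the paper's code) of finding "phrase-like" sequences in a single self-attention head: runs of consecutive states whose attention mass concentrates on the same position. The function name `phrase_like_spans` and the `min_len` threshold are illustrative assumptions.

```python
import numpy as np

def phrase_like_spans(attn: np.ndarray, min_len: int = 2):
    """attn: [seq_len, seq_len] attention weights of one head
    (rows = attending states, columns = attended positions).
    Returns (start, end, attended_pos) spans, end exclusive."""
    argmax = attn.argmax(axis=1)          # position each state attends to most
    spans, start = [], 0
    for i in range(1, len(argmax) + 1):
        # close the current run when the attended position changes
        if i == len(argmax) or argmax[i] != argmax[start]:
            if i - start >= min_len:      # keep runs of at least min_len states
                spans.append((start, i, int(argmax[start])))
            start = i
    return spans

# Toy example: states 1-3 all attend mostly to position 4,
# so together they form one phrase-like span.
attn = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.60],
    [0.05, 0.05, 0.10, 0.10, 0.70],
    [0.10, 0.10, 0.10, 0.10, 0.60],
    [0.10, 0.10, 0.10, 0.60, 0.10],
])
print(phrase_like_spans(attn))  # [(1, 4, 4)]
```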
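The evaluation step compares the induced trees to treebank parses. The following sketch shows bracket precision and recall under the simplifying assumption that both trees are represented as sets of (start, end) constituent spans; this is a generic bracket-scoring routine, not the paper's exact evaluation setup.

```python
def bracket_prf(predicted: set, gold: set):
    """predicted, gold: sets of (start, end) constituent spans."""
    matched = len(predicted & gold)       # spans present in both trees
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# E.g. spans built from the attention heads vs. spans of a treebank parse:
pred = {(0, 2), (2, 5), (0, 5)}
gold = {(0, 2), (3, 5), (0, 5)}
print(bracket_prf(pred, gold))  # (0.667, 0.667, 0.667), approximately
```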
URL
https://arxiv.org/abs/1906.01958