BERTering RAMS: What and How Much does BERT Already Know About Event Arguments? -- A Study on the RAMS Dataset

Abstract
Abstract (translated)
URL
PDF

Abstract

Using the attention map based probing framework from (Clark et al., 2019), we observe that BERT's attention heads have modest but well above-chance ability to spot event arguments sans any training or domain finetuning, varying from a low of 17.77% for Place to a high of 51.61% for Artifact. Next, we find that linear combinations of these heads, estimated with approx 11% of available total supervision, can push performance well-higher for some roles - highest two being Victim (68.29% Accuracy) and Artifact (58.82% Accuracy). Furthermore, we investigate how well our methods do for cross-sentence event arguments. We propose a procedure to isolate "best heads" for cross-sentence argument detection separately of those for intra-sentence arguments. The heads thus estimated have superior cross-sentence performance compared to their jointly estimated equivalents. Lastly, we seek to isolate to what extent our numbers stem from lexical frequency based associations between gold arguments and roles. We propose NONCE, a scheme to create adversarial test examples by replacing gold arguments with randomly generated "nonce" words. We find that learnt linear combinations are robust to NONCE, though individual best heads can be much more sensitive.

Abstract (translated)

URL

https://arxiv.org/abs/2010.04098

PDF

https://arxiv.org/pdf/2010.04098.pdf