De-biasing Distantly Supervised Named Entity Recognition via Causal Intervention

2021-06-17 04:01:02

Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun

arXiv_CL

Abstract

Distant supervision tackles the data bottleneck in named entity recognition (NER) by automatically generating training instances via dictionary matching. Unfortunately, learning in distantly supervised NER (DS-NER) is severely dictionary-biased: models pick up spurious correlations, which undermines their effectiveness and robustness. In this paper, we explain the dictionary bias in terms of a Structural Causal Model (SCM), categorize it into intra-dictionary and inter-dictionary biases, and identify their causes. Based on the SCM, we learn de-biased DS-NER models via causal interventions. For intra-dictionary bias, we conduct backdoor adjustment to remove the spurious correlations introduced by the dictionary confounder. For inter-dictionary bias, we propose a causal invariance regularizer that makes DS-NER models more robust to perturbations of the dictionary. Experiments on four datasets and three DS-NER models show that our method significantly improves DS-NER performance.
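
To make the starting point concrete, the following is a minimal sketch of how dictionary matching can generate distantly supervised NER labels. It is an illustration only, not the authors' pipeline: the function name `distant_label`, the greedy longest-match strategy, the BIO tag scheme, and the toy dictionary are all assumptions introduced here.

```python
from typing import Dict, List, Tuple


def distant_label(
    tokens: List[str], dictionary: Dict[Tuple[str, ...], str]
) -> List[str]:
    """Assign BIO tags by greedy longest-match against a dictionary.

    `dictionary` maps a lowercased token tuple to an entity type.
    This is a hypothetical helper for illustration, not the paper's code.
    """
    labels = ["O"] * len(tokens)
    # Longest dictionary entry bounds the span lengths we need to try.
    max_len = max((len(key) for key in dictionary), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i : i + n])
            if span in dictionary:
                entity_type = dictionary[span]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Toy dictionary (assumed for illustration); note that "Boston" is missing.
toy_dictionary = {("new", "york"): "LOC", ("ada", "lovelace"): "PER"}
tokens = "Ada Lovelace visited New York and Boston".split()
labels = distant_label(tokens, toy_dictionary)

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this toy run, "Boston" receives the tag `O` only because it is absent from the dictionary. The labels therefore depend on which entries the dictionary happens to contain, which gives a rough intuition for why a model trained on such data can inherit dictionary-specific correlations.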

URL

https://arxiv.org/abs/2106.09233

PDF

https://arxiv.org/pdf/2106.09233.pdf