Abstract
Zero-shot grammatical error detection is the task of tagging token-level errors in a sentence when only given access to labels at the sentence-level for training. Recent work has explored attention- and gradient-based approaches for the task. We analyze a decomposition of a CNN trained as a sentence-level classifier, demonstrating zero-shot labeling effectiveness competitive with previously proposed bi-LSTM attention-based approaches. Interestingly, with the advantage of pre-trained contextualized embeddings, the approach is competitive with baseline (but no longer state-of-the-art) fully supervised bi-LSTM models (using standard pre-trained word embeddings), despite only having access to sentence-level labels for training. For reference, we also show that the basic approach extends to the fully supervised setting, yielding an error detection model as strong as the current state-of-the-art fully supervised approach with feature-based contextualized embeddings.
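The abstract does not spell out the decomposition, but the general idea of attributing a max-pooled CNN sentence classifier's score back to individual tokens can be illustrated as follows. This is a minimal sketch under assumed details (single conv layer, ReLU, max-over-time pooling, linear read-out; all dimensions are toy values), not the authors' exact model: each filter's weighted pooled activation is routed back to the tokens inside the window where it fired.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
T, d, F, k = 6, 8, 4, 3   # tokens, embedding dim, num filters, filter width

X = rng.normal(size=(T, d))      # token embeddings for one sentence
W = rng.normal(size=(F, k, d))   # convolutional filters
v = rng.normal(size=F)           # sentence-classifier weights ("error" class)

# Convolution + ReLU: one activation per (filter, window position).
n_win = T - k + 1
A = np.zeros((F, n_win))
for f in range(F):
    for t in range(n_win):
        A[f, t] = max(0.0, float(np.sum(W[f] * X[t:t + k])))

# Max-over-time pooling; record which window each filter fired on.
pooled = A.max(axis=1)
argmax = A.argmax(axis=1)
sentence_score = float(v @ pooled)   # sentence-level "contains error" logit

# Decomposition: spread each filter's weighted contribution uniformly
# over the tokens inside its maximally activating window.
token_scores = np.zeros(T)
for f in range(F):
    token_scores[argmax[f]:argmax[f] + k] += v[f] * pooled[f] / k
```

By construction the token contributions sum back to the sentence logit, so thresholding `token_scores` yields zero-shot token-level error predictions from a model trained only with sentence-level labels.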
URL
https://arxiv.org/abs/1906.01154