Contrastive Out-of-Distribution Detection for Pretrained Transformers

Abstract

Pretrained transformers achieve remarkable performance when the test data follows the same distribution as the training data. In real-world NLU tasks, however, the model often faces out-of-distribution (OoD) instances. Such instances introduce a severe semantic shift at inference time, so the model should identify and reject them. In this paper, we study the OoD detection problem for pretrained transformers using only in-distribution data in training. We observe that OoD instances can be detected using the Mahalanobis distance computed on penultimate-layer representations. We further propose a contrastive loss that improves the compactness of representations, such that OoD instances can be better differentiated from in-distribution ones. Experiments on the GLUE benchmark demonstrate the effectiveness of the proposed methods.
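
To make the Mahalanobis-distance idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes penultimate-layer features have already been extracted into NumPy arrays, and it assumes one common formulation: a Gaussian per in-distribution class with a covariance shared across classes. The function names `fit_class_gaussians` and `mahalanobis_ood_score`, the ridge term, and the toy data are all hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_class_gaussians(features, labels):
    """Estimate per-class means and a shared precision matrix.

    features: (n_samples, dim) array of in-distribution representations
              (assumed to come from the penultimate layer).
    labels:   (n_samples,) array of integer class labels.
    """
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    # Center each sample by its own class mean, then pool for one covariance.
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    # Small ridge term (an assumption) keeps the matrix well conditioned.
    precision = np.linalg.pinv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, precision


def mahalanobis_ood_score(x, means, precision):
    """Smallest squared Mahalanobis distance from x to any class mean.

    A larger score means x lies farther from every in-distribution class
    and is therefore more likely to be OoD.
    """
    dists = []
    for mu in means.values():
        diff = x - mu
        dists.append(float(diff @ precision @ diff))
    return min(dists)


# Toy stand-in for real features: two well-separated classes in 8 dimensions.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 1.0, (200, 8)),
                   rng.normal(4.0, 1.0, (200, 8))])
labels = np.array([0] * 200 + [1] * 200)

means, precision = fit_class_gaussians(feats, labels)
in_score = mahalanobis_ood_score(feats[0], means, precision)
ood_score = mahalanobis_ood_score(np.full(8, 20.0), means, precision)
```

In practice a rejection threshold on the score would be tuned on held-out data. More compact class clusters, which the abstract attributes to the contrastive loss, would be expected to widen the gap between in-distribution and OoD scores.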

URL

https://arxiv.org/abs/2104.08812

PDF

https://arxiv.org/pdf/2104.08812.pdf
