Not to Overfit or Underfit? A Study of Domain Generalization in Question Answering

2022-05-15 10:53:40

Md Arafat Sultan, Avirup Sil, Radu Florian

arXiv_CL Knowledge Zero-Shot

Abstract

Machine learning models are prone to overfitting their source (training) distributions, which is commonly believed to be why they falter in novel target domains. Here we examine the contrasting view that multi-source domain generalization (DG) is in fact a problem of mitigating source domain underfitting: models not adequately learning the signal in their multi-domain training data. Experiments on a reading comprehension DG benchmark show that as a model gradually learns its source domains better -- using known methods such as knowledge distillation from a larger model -- its zero-shot out-of-domain accuracy improves at an even faster rate. Improved source domain learning also demonstrates superior generalization over three popular domain-invariant learning methods that aim to counter overfitting.
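The abstract names knowledge distillation from a larger model as one way to improve source domain learning. As a rough illustration only, the sketch below shows a generic distillation objective: a weighted sum of cross-entropy on the gold label and a temperature-softened KL divergence from a teacher's distribution to the student's. The function names, the `temperature` and `alpha` parameters, and the framing over a single vector of logits (for example, candidate answer start positions) are assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation objective (hypothetical helper, not the paper's code).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence KL(teacher || student) computed at the given temperature.
    """
    # Hard-label term: cross-entropy of the student against the gold label.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[gold_index])

    # Soft-label term: KL divergence between temperature-softened distributions.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(teacher_soft, student_soft) if p > 0)

    # Scaling by temperature**2 keeps the soft term's gradient magnitude
    # comparable to the hard term's, a common convention in distillation.
    return alpha * cross_entropy + (1 - alpha) * (temperature ** 2) * kl


# Example: a student that agrees with the teacher incurs a lower loss
# than one that ranks the candidates in the opposite order.
teacher = [1.0, 2.0, 3.0]
agreeing = distillation_loss([1.0, 2.0, 3.0], teacher, gold_index=2)
disagreeing = distillation_loss([3.0, 2.0, 1.0], teacher, gold_index=2)
```

In this sketch, setting `alpha=0.0` trains the student purely to match the teacher, while `alpha=1.0` reduces to ordinary supervised training.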

URL

https://arxiv.org/abs/2205.07257

PDF

https://arxiv.org/pdf/2205.07257.pdf