Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning

2021-08-15 13:42:10

Cunxiang Wang, Boyuan Zheng, Yuchen Niu, Yue Zhang

arXiv_AI

arXiv_AI Language_Model Quantitative

Abstract

To quantitatively and intuitively explore the generalization ability of pre-trained language models (PLMs), we design several arithmetic and logical reasoning tasks. We analyse how well PLMs generalize both when the test data follows the same distribution as the training data and when it does not; for the latter analysis, we design a cross-distribution test set in addition to the in-distribution test set. We conduct experiments on BART, one of the most advanced publicly released generative PLMs. We find that PLMs generalize easily when the distribution is the same, but that generalizing out of distribution remains difficult for them.
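
As a rough illustration of the in-distribution versus cross-distribution setup described in the abstract, the sketch below builds a toy addition task in Python. The operand ranges, the `"a + b"` text format, and the helper names `make_addition_examples` and `exact_match` are all assumptions made for this example; the abstract does not specify how the authors constructed their data or scored BART.

```python
import random


def make_addition_examples(n, low, high, seed):
    """Return n (source, target) string pairs for two-operand addition.

    Operands are drawn uniformly from [low, high]. The range and the
    "a + b" text format are illustrative assumptions, not the paper's.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        a = rng.randint(low, high)
        b = rng.randint(low, high)
        examples.append((f"{a} + {b}", str(a + b)))
    return examples


# Hypothetical split: train and in-distribution test share an operand range,
# while the cross-distribution test uses larger, unseen operands.
# Overlap between train and in-distribution test pairs is not removed here.
train = make_addition_examples(1000, 0, 99, seed=0)
in_dist_test = make_addition_examples(200, 0, 99, seed=1)
cross_dist_test = make_addition_examples(200, 100, 999, seed=2)


def exact_match(predict, examples):
    """Fraction of examples where predict(source) equals the target string."""
    correct = sum(predict(src) == tgt for src, tgt in examples)
    return correct / len(examples)
```

Here `predict` stands in for any sequence-to-sequence model wrapped as a function from input text to output text. Comparing `exact_match` on `in_dist_test` and on `cross_dist_test` gives the kind of same-distribution versus out-of-distribution contrast the abstract reports.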

URL

https://arxiv.org/abs/2108.06743

PDF

https://arxiv.org/pdf/2108.06743.pdf