A Universal Discriminator for Zero-Shot Generalization

2022-11-15 12:33:31

Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang

arXiv_CL GAN Zero-Shot

Abstract

Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single discriminator to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we use this discriminator to predict the option with the highest probability. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5%, respectively, at different scales. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 of the parameters of previous methods. Meanwhile, our approach requires minimal prompting effort, which substantially improves robustness and is essential for real-world applications. Furthermore, we jointly train a generalized universal discriminator (UD) in combination with generative tasks; it maintains its advantage on discriminative tasks while also working on generative tasks.
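
The option-selection step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `select_option`, `toy_discriminator`, and the choice to join the context and each option with a space are hypothetical and are not taken from the paper. The toy scorer merely stands in for a trained discriminator that would return the probability that a text sample comes from the true data distribution.

```python
import math
from typing import Callable, Sequence


def sigmoid(x: float) -> float:
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_option(
    context: str,
    options: Sequence[str],
    discriminator: Callable[[str], float],
) -> int:
    """Return the index of the option that the discriminator scores highest.

    Assumption: each candidate text is formed by joining the context and
    the option with a space. The paper may construct inputs differently.
    """
    scores = [discriminator(f"{context} {option}") for option in options]
    # Pick the index with the highest score (first index wins on ties).
    return max(range(len(options)), key=scores.__getitem__)


def toy_discriminator(text: str) -> float:
    """Hypothetical stand-in for a trained discriminator.

    It assigns a higher probability to texts containing a word from a
    tiny hand-written set, purely so the example runs without a model.
    """
    plausible_words = {"paris"}
    tokens = text.lower().replace(".", " ").replace("?", " ").split()
    hits = sum(token in plausible_words for token in tokens)
    return sigmoid(2.0 * hits - 1.0)


if __name__ == "__main__":
    candidates = ["Berlin", "Paris", "Rome"]
    best = select_option("The capital of France is", candidates, toy_discriminator)
    print(candidates[best])
```

In a realistic setting, `toy_discriminator` would be replaced by a call to a trained model; the selection logic itself only requires a function that maps a text sample to a score.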

URL

https://arxiv.org/abs/2211.08099

PDF

https://arxiv.org/pdf/2211.08099.pdf