Abstract
Although pre-trained language models (PLMs) have exhibited great flexibility and versatility with prompt-based few-shot learning, they suffer from extensive parameter sizes and limited applicability at inference time. Recent studies have suggested using PLMs as dataset generators and training a tiny task-specific model to achieve efficient inference. However, their applicability to various domains is limited because they tend to generate domain-specific datasets. In this work, we propose a novel approach to universal domain generalization that generates a dataset regardless of the target domain. This allows the tiny task model to generalize to any domain that shares the label space, thus enhancing the real-world applicability of the dataset generation paradigm. Our experiments indicate that the proposed method achieves generalizability across various domains while using a parameter set that is orders of magnitude smaller than that of PLMs.
URL
https://arxiv.org/abs/2405.01022