Abstract
We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes. FastFit utilizes a novel approach integrating batch contrastive learning and a token-level similarity score. Compared to existing few-shot learning packages, such as SetFit, Transformers, or few-shot prompting of large language models via API calls, FastFit significantly improves multiclass classification performance in both speed and accuracy across FewMany, our newly curated English benchmark, and multilingual datasets. FastFit demonstrates a 3-20x improvement in training speed, completing training in just a few seconds. The FastFit package is now available on GitHub and PyPI, presenting a user-friendly solution for NLP practitioners.
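The abstract does not spell out how the token-level similarity score is computed. As a rough illustration only, the sketch below assumes a late-interaction style score: each token embedding of the input text is matched to its most similar token embedding of a class name, and the best matches are summed. The function names (`cosine`, `token_level_similarity`, `predict`) and the toy two-dimensional embeddings are hypothetical and are not taken from the FastFit package.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def token_level_similarity(text_tokens, class_tokens):
    """Assumed scoring rule: for every text token, take its best cosine
    match among the class tokens, then sum those best matches."""
    return sum(max(cosine(t, c) for c in class_tokens) for t in text_tokens)

def predict(text_tokens, classes):
    """Return the class name whose token embeddings score highest."""
    return max(classes, key=lambda name: token_level_similarity(text_tokens, classes[name]))

# Toy token embeddings (hypothetical; a real system would use an encoder).
text = [[1.0, 0.0], [0.0, 1.0]]
classes = {
    "card_lost": [[1.0, 0.0], [0.0, 1.0]],
    "card_stolen": [[1.0, 0.0], [-1.0, 0.0]],
}

best = predict(text, classes)
```

Scoring at the token level rather than with a single pooled vector lets two class names that share most of their tokens still be separated by the tokens on which they differ, which is the situation the abstract highlights (many semantically similar classes).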
URL
https://arxiv.org/abs/2404.12365