Paper Reading AI Learner

Differentiable Entailment for Parameter Efficient Few Shot Learning

2023-01-31 00:31:11
Ethan Kim, Jerry Yang

Abstract

Few-shot learning allows pre-trained language models to adapt to downstream tasks while using a limited number of training examples. However, practical applications are limited when all model parameters must be optimized. In this work we apply a new technique for parameter efficient few shot learning while adopting a strict definition of parameter efficiency. Our training method combines 1) intermediate training by reformulating natural language tasks as entailment tasks \cite{wang_entailment_2021} and 2) differentiable optimization of template and label tokens \cite{zhang_differentiable_2021}. We quantify the tradeoff between parameter efficiency and performance in the few-shot regime and propose a simple model agnostic approach that can be extended to any task By achieving competitive performance while only optimizing 3\% of a model's parameters and allowing for batched inference, we allow for more efficient practical deployment of models.

Abstract (translated)

少量学习允许预先训练的语言模型使用有限的训练示例适应后续任务。然而,当所有模型参数必须优化时,实际应用是有限的。在本工作中,我们应用了一种参数高效的少量学习技术,同时采用严格的参数效率定义。我们的训练方法包括将自然语言任务改写为暗示任务( cite{wang_entailment_2021} )和将模板和标签 token 的可区分优化( cite{zhang_differentiable_2021} )相结合。我们量化了参数效率和性能之间的权衡,并提出了一种模型无关的方法,可以扩展到任何任务。通过仅优化模型参数的3%,同时允许批量推理,我们允许更有效地在实践中部署模型。

URL

https://arxiv.org/abs/2301.13345

PDF

https://arxiv.org/pdf/2301.13345.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot