Paper Reading AI Learner

Reinforcement Learning with Generative Models for Compact Support Sets

2024-04-25 02:48:16
Nico Schiavone, Xingyu Li

Abstract

Foundation models contain a wealth of information from their vast number of training samples. However, most prior arts fail to extract this information in a precise and efficient way for small sample sizes. In this work, we propose a framework utilizing reinforcement learning as a control for foundation models, allowing for the granular generation of small, focused synthetic support sets to augment the performance of neural network models on real data classification tasks. We first allow a reinforcement learning agent access to a novel context based dictionary; the agent then uses this dictionary with a novel prompt structure to form and optimize prompts as inputs to generative models, receiving feedback based on a reward function combining the change in validation accuracy and entropy. A support set is formed this way over several exploration steps. Our framework produced excellent results, increasing classification accuracy by significant margins for no additional labelling or data cost.

Abstract (translated)

基础模型包含大量训练样本中所学到的丰富信息。然而,大多数先前的艺术作品在小型样本量的情况下无法精确有效地提取这些信息。在本文中,我们提出了一种利用强化学习作为基础模型控制的方法,允许在小型、关注点状的合成支持集上生成细粒度的支持集,以提高神经网络模型在真实数据分类任务上的性能。我们首先允许一个强化学习代理访问一个新颖的上下文基词表;然后,代理使用此基词表与新颖的提示结构形成和优化提示作为输入,根据基于验证准确性和熵的奖励函数接收反馈。通过几次探索步骤,这样就可以形成一个支持集。我们的框架产生了很好的结果,在不需要额外标签或数据成本的情况下,将分类准确度提高了显著的幅度。

URL

https://arxiv.org/abs/2404.16300

PDF

https://arxiv.org/pdf/2404.16300.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot