
FREE: Faster and Better Data-Free Meta-Learning

2024-05-02 03:43:19
Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, Dacheng Tao

Abstract

Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, offering practical benefits in contexts constrained by data privacy concerns. Current DFML methods primarily focus on recovering data from these pre-trained models, but they suffer from slow recovery speed and overlook the gaps inherent in heterogeneous pre-trained models. In response to these challenges, we introduce the Faster and Better Data-Free Meta-Learning (FREE) framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks. Specifically, within the Faster Inversion via Meta-Generator module, each pre-trained model is treated as a distinct task. The meta-generator can adapt to a specific task in just five steps, significantly accelerating data recovery. Furthermore, we propose Better Generalization via Meta-Learner and introduce an implicit gradient alignment algorithm to optimize the meta-learner: aligned gradient directions alleviate potential conflicts among tasks from heterogeneous pre-trained models. Empirical experiments on multiple benchmarks confirm the superiority of our approach, showing a notable speed-up (20$\times$) and performance gains (1.42\% $\sim$ 4.78\%) over the state-of-the-art.
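The recovery procedure described above lends itself to a compact sketch. The PyTorch-style code below illustrates only the first idea, fast inversion via a meta-generator: a shared generator is copied, adapted to one pre-trained model with a handful of gradient steps on an inversion loss, and the meta-parameters are then nudged toward the adapted copy (a Reptile-style outer update is used here for simplicity). All names (`MetaGenerator`, `fast_inversion`, `outer_update`), the toy architecture, and the choice of outer update are illustrative assumptions rather than the authors' implementation; the implicit gradient alignment used to train the meta-learner is not shown.

```python
# Minimal, illustrative sketch (not the authors' code) of the fast-inversion idea:
# a meta-generator adapts to each pre-trained model in a few gradient steps,
# recovering a synthetic training task from it.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class MetaGenerator(nn.Module):
    """Maps random noise plus class labels to synthetic images (toy architecture)."""

    def __init__(self, noise_dim=100, num_classes=10, img_size=32):
        super().__init__()
        self.embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 512), nn.ReLU(),
            nn.Linear(512, 3 * img_size * img_size), nn.Tanh(),
        )
        self.img_size = img_size

    def forward(self, z, y):
        x = self.net(z * self.embed(y))
        return x.view(-1, 3, self.img_size, self.img_size)


def fast_inversion(meta_gen, pretrained, num_classes, steps=5, lr=1e-2, batch=64):
    """Adapt a copy of the meta-generator to one pre-trained model in a few steps."""
    gen = copy.deepcopy(meta_gen)                  # task-specific copy
    opt = torch.optim.SGD(gen.parameters(), lr=lr)
    pretrained.eval()
    for _ in range(steps):                         # "just five steps" in the abstract
        z = torch.randn(batch, 100)
        y = torch.randint(0, num_classes, (batch,))
        logits = pretrained(gen(z, y))             # inversion: model should predict y
        loss = F.cross_entropy(logits, y)
        opt.zero_grad(); loss.backward(); opt.step()
    return gen


def outer_update(meta_gen, adapted_gen, outer_lr=0.1):
    """Reptile-style meta update (an assumption): move meta-params toward the adapted ones."""
    with torch.no_grad():
        for p_meta, p_task in zip(meta_gen.parameters(), adapted_gen.parameters()):
            p_meta += outer_lr * (p_task - p_meta)
```

In use, each pre-trained model in the collection would play the role of one task: call `fast_inversion` on it, sample synthetic data from the adapted generator to build a recovered task for the meta-learner, and apply `outer_update` so the shared generator starts closer to a good solution for the next model.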


URL

https://arxiv.org/abs/2405.00984

PDF

https://arxiv.org/pdf/2405.00984.pdf

