Paper Reading AI Learner

A Framework for Double-Blind Federated Adaptation of Foundation Models

2025-02-03 12:00:11
Nurbek Tastan, Karthik Nandakumar

Abstract

The availability of foundation models (FMs) pre-trained on large-scale data has advanced the state-of-the-art in many computer vision tasks. While FMs have demonstrated good zero-shot performance on many image classification tasks, there is often scope for performance improvement by adapting the FM to the downstream task. However, the data required for this adaptation typically exists in silos across multiple entities (data owners) and cannot be collated at a central location due to regulations and privacy concerns. At the same time, a learning service provider (LSP) who owns the FM cannot share the model with the data owners due to proprietary reasons. In some cases, the data owners may not even have the resources to store such large FMs. Hence, there is a need for algorithms to adapt the FM in a double-blind federated manner, i.e., the data owners do not know the FM or each other's data, and the LSP does not see the data for the downstream tasks. In this work, we propose a framework for double-blind federated adaptation of FMs using fully homomorphic encryption (FHE). The proposed framework first decomposes the FM into a sequence of FHE-friendly blocks through knowledge distillation. The resulting FHE-friendly model is adapted for the downstream task via low-rank parallel adapters that can be learned without backpropagation through the FM. Since the proposed framework requires the LSP to share intermediate representations with the data owners, we design a privacy-preserving permutation scheme to prevent the data owners from learning the FM through model extraction attacks. Finally, a secure aggregation protocol is employed for federated learning of the low-rank parallel adapters. Empirical results on four datasets demonstrate the practical feasibility of the proposed framework.
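The low-rank parallel adapter idea mentioned in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the dimensions, initialization, and the stand-in "frozen block" are all illustrative assumptions. The key property shown is that the adapter runs as a parallel bypass around the frozen block, so training its two small matrices never requires backpropagating through the FM's own weights.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 2  # hypothetical feature dimension and adapter rank (r << d)

# Stand-in for one frozen FM block owned by the LSP: a fixed linear map.
W_frozen = rng.standard_normal((d, d))

# Low-rank parallel adapter: trainable down-projection A and up-projection B.
A = np.zeros((r, d))                     # zero-init: adapter starts inert
B = rng.standard_normal((d, r)) * 0.01

def block_forward(x):
    # Adapter output is added in parallel to the frozen block's output,
    # so gradients for A and B never flow through W_frozen.
    return x @ W_frozen.T + x @ A.T @ B.T

x = rng.standard_normal((4, d))
y = block_forward(x)

# With A initialized to zero, the adapter path contributes nothing, so the
# adapted block initially reproduces the frozen block exactly.
assert np.allclose(y, x @ W_frozen.T)
```

Zero-initializing one of the two factors is a common choice for such adapters because it guarantees the adapted model starts from the frozen model's behavior and only drifts as the adapter is trained.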

Abstract (translated)

Foundation models (FMs) pre-trained on large-scale data have achieved state-of-the-art results on many computer vision tasks. Although FMs exhibit good zero-shot performance on many image classification tasks, there is usually still room for improvement when adapting an FM to a downstream task. However, the data required for this adaptation is often scattered across multiple entities (data owners) and cannot be stored centrally due to regulatory and privacy concerns. Meanwhile, the learning service provider (LSP) that owns the FM cannot share the model with the data owners for proprietary reasons. Moreover, in some cases the data owners may not even have the resources to store such large FMs. Hence, algorithms are needed to adapt the FM in a double-blind federated manner: the data owners do not know the FM or each other's data, and the LSP does not see the downstream-task data. In this paper, we propose a framework based on fully homomorphic encryption (FHE) for double-blind federated adaptation of FMs. The framework first decomposes the FM into a sequence of FHE-friendly blocks via knowledge distillation. The resulting FHE-friendly model is adapted to the downstream task through low-rank parallel adapters, which can be learned without backpropagation through the FM. Since the framework requires the LSP to share intermediate representations with the data owners, we design a privacy-preserving permutation scheme to prevent the data owners from learning the FM via model extraction attacks. Finally, a secure aggregation protocol is used for federated learning of the low-rank parallel adapters. Experiments on four datasets demonstrate the practical feasibility of the proposed framework.
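The secure aggregation step described in the abstract can be sketched with pairwise additive masking, a standard construction for this kind of protocol. This is a simplified illustration, not the paper's protocol: real secure aggregation derives the pairwise masks from key agreement and handles dropouts, whereas here the masks are just shared random vectors. The property demonstrated is that individual masked updates reveal nothing on their own, yet the masks cancel exactly in the server's sum.

```python
import numpy as np

rng = np.random.default_rng(42)
n_clients, dim = 3, 4  # hypothetical number of data owners and update size

# Each data owner holds a private low-rank adapter update (flattened here).
updates = [rng.standard_normal(dim) for _ in range(n_clients)]

# Pairwise random masks s_ij shared between owners i < j. Owner i ADDS s_ij
# for every j > i and SUBTRACTS s_ji for every j < i, so the masks cancel
# in the aggregate and the server learns only the sum of updates.
masks = {(i, j): rng.standard_normal(dim)
         for i in range(n_clients) for j in range(i + 1, n_clients)}

def masked_update(i):
    m = updates[i].copy()
    for j in range(n_clients):
        if i < j:
            m += masks[(i, j)]
        elif j < i:
            m -= masks[(j, i)]
    return m

# The server sums the masked updates; all pairwise masks cancel exactly.
aggregate = sum(masked_update(i) for i in range(n_clients))
assert np.allclose(aggregate, sum(updates))
```

In the paper's setting, the aggregated quantity would be the data owners' low-rank adapter updates, so the LSP (or aggregator) sees only their sum and never any single owner's contribution.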

URL

https://arxiv.org/abs/2502.01289

PDF

https://arxiv.org/pdf/2502.01289.pdf

