
Does Federated Learning Really Need Backpropagation?

2023-01-28 13:34:36
Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin

Abstract

Federated learning (FL) is a general principle for decentralized clients to train a server model collectively without sharing local data. FL is a promising framework with practical applications, but its standard training paradigm requires the clients to backpropagate through the model to compute gradients. Since these clients are typically edge devices and not fully trusted, executing backpropagation on them incurs computational and storage overhead as well as white-box vulnerability. In light of this, we develop backpropagation-free federated learning, dubbed BAFFLE, in which backpropagation is replaced by multiple forward processes to estimate gradients. BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments, because the clients in BAFFLE only execute forward propagation and return a set of scalars to the server. Empirically, we use BAFFLE to train deep models from scratch or to finetune pretrained models, achieving acceptable results. Code is available at this https URL.
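The idea of replacing backpropagation with multiple forward passes, with clients returning only scalars, can be illustrated with a generic zeroth-order gradient estimator. The sketch below is not the authors' code: the function names (`client_loss_diffs`, `server_gradient_estimate`), the Gaussian perturbations, the shared random seed, and the toy quadratic loss in the usage note are all assumptions made for illustration.

```python
import numpy as np


def client_loss_diffs(loss_fn, w, seed, k, sigma):
    """Hypothetical client step: k + 1 forward passes, no backpropagation.

    Returns k scalars, one loss difference per random perturbation.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(w)  # one forward pass at the current weights
    diffs = np.empty(k)
    for i in range(k):
        delta = rng.normal(0.0, sigma, size=w.shape)
        diffs[i] = loss_fn(w + delta) - base  # one forward pass per perturbation
    return diffs


def server_gradient_estimate(diffs, shape, seed, sigma):
    """Hypothetical server step: rebuild the same perturbations from the
    shared seed and combine them with the client's scalars.

    Uses the estimator mean(diff_i * delta_i) / sigma**2, which approximates
    the gradient for small sigma.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros(shape)
    for d in diffs:
        delta = rng.normal(0.0, sigma, size=shape)
        grad += d * delta
    return grad / (len(diffs) * sigma ** 2)
```

As a usage check under these assumptions, take the toy loss `loss_fn = lambda w: 0.5 * np.sum(w ** 2)`, whose true gradient is `w` itself. Calling `client_loss_diffs(loss_fn, w, seed=0, k=20000, sigma=1e-2)` and passing the result to `server_gradient_estimate` with the same seed and sigma should give a vector close to `w`. The estimate is noisy, and its variance grows with the number of parameters, so many forward passes are needed per update.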

URL

https://arxiv.org/abs/2301.12195

PDF

https://arxiv.org/pdf/2301.12195.pdf

