Paper Reading AI Learner

Bayesian Federated Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing

2024-02-12 01:47:06
Wei Xu, An Liu, Yiting Zhang, Vincent Lau

Abstract

Federated learning (FL) is a machine learning paradigm where the clients possess decentralized training data and the central server handles aggregation and scheduling. Typically, FL algorithms involve clients training their local models using stochastic gradient descent (SGD), which carries drawbacks such as slow convergence and being prone to getting stuck in suboptimal solutions. In this work, we propose a message passing based Bayesian federated learning (BFL) framework to avoid these drawbacks. Specifically, we formulate the problem of deep neural network (DNN) learning and compression as a sparse Bayesian inference problem, in which a group sparse prior is employed to achieve structured model compression. Then, we propose an efficient BFL algorithm called EMTDAMP, where expectation maximization (EM) and turbo deep approximate message passing (TDAMP) are combined to achieve distributed learning and compression. The central server aggregates local posterior distributions to update the global posterior distribution and updates hyperparameters based on EM to accelerate convergence. The clients perform TDAMP to achieve efficient approximate message passing over the DNN with a joint prior distribution. We detail the application of EMTDAMP to Boston housing price prediction and handwriting recognition, and present extensive numerical results to demonstrate the advantages of EMTDAMP.
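The abstract's server-side loop (fuse local posteriors into a global posterior, then re-estimate prior hyperparameters via EM) can be sketched in minimal form. This is a hypothetical illustration, not the paper's actual EMTDAMP: it assumes each client reports an independent Gaussian posterior N(mu_k, var_k) per weight, fuses them by precision-weighted averaging (the product-of-Gaussians rule), and performs an EM-style M-step that re-estimates the variance of a zero-mean Gaussian prior. The function names are invented for this sketch.

```python
import numpy as np

def aggregate_gaussian_posteriors(mus, variances):
    """Fuse K per-client Gaussian posteriors into one global Gaussian.

    Multiplying Gaussian densities adds precisions and precision-weights
    the means; this is a stand-in for the paper's posterior aggregation.
    """
    mus = np.asarray(mus, dtype=float)          # shape (K, n_weights)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    global_prec = precisions.sum(axis=0)
    global_mu = (precisions * mus).sum(axis=0) / global_prec
    return global_mu, 1.0 / global_prec

def em_update_prior_variance(global_mu, global_var):
    """EM M-step for a zero-mean Gaussian prior on the weights.

    The update is the posterior expectation E[w^2] = mu^2 + var,
    averaged over weights (an assumption of this sketch, not the
    paper's group sparse prior update).
    """
    return float(np.mean(np.asarray(global_mu) ** 2 + np.asarray(global_var)))
```

For example, two clients reporting N(0, 1) and N(2, 1) for a single weight fuse to N(1, 0.5), and the EM step then sets the prior variance to 1.0² + 0.5 = 1.5. The paper's group sparse prior and TDAMP client updates replace these simplified steps.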

Abstract (translated)

Federated learning (FL) is a machine learning paradigm in which the clients hold decentralized training data and a central server is responsible for aggregation and scheduling. Typically, FL algorithms involve clients training their local models with stochastic gradient descent (SGD), which suffers from problems such as slow convergence and a tendency to get stuck in suboptimal solutions. In this paper, we propose a message passing based Bayesian federated learning (BFL) framework to avoid these drawbacks. Specifically, we model deep neural network (DNN) learning and compression as a sparse Bayesian inference problem, in which a group sparse prior is adopted to achieve structured model compression. We then propose an efficient BFL algorithm, EMTDAMP, which combines expectation maximization (EM) and turbo deep approximate message passing (TDAMP) to achieve distributed learning and compression. The central server aggregates local posterior distributions to update the global posterior distribution, and updates hyperparameters based on EM to accelerate convergence. The clients run TDAMP to achieve efficient approximate message passing over the DNN. We detail the application of EMTDAMP to Boston housing price prediction and handwriting recognition, and provide extensive numerical results to demonstrate the advantages of EMTDAMP.

URL

https://arxiv.org/abs/2402.07366

PDF

https://arxiv.org/pdf/2402.07366.pdf

