Paper Reading AI Learner

MultiConfederated Learning: Inclusive Non-IID Data handling with Decentralized Federated Learning

2024-04-20 16:38:26
Michael Duchesne, Kaiwen Zhang, Chamseddine Talhi

Abstract

Federated Learning (FL) has emerged as a prominent privacy-preserving technique for enabling use cases like confidential clinical machine learning. FL operates by aggregating models trained by the remote devices which own the data. Thus, FL enables the training of powerful global models using crowd-sourced data from a large number of learners, without compromising their privacy. However, the aggregating server is a single point of failure when generating the global model. Moreover, model performance suffers when the data is not independent and identically distributed (non-IID) across the remote devices. This leads to vastly different models being aggregated, which can reduce performance by as much as 50% in certain scenarios. In this paper, we seek to address these issues while retaining the benefits of FL. We propose MultiConfederated Learning: a decentralized FL framework designed to handle non-IID data. Unlike traditional FL, MultiConfederated Learning maintains multiple models in parallel (instead of a single global model) to aid convergence when the data is non-IID. With the help of transfer learning, learners can converge to fewer models. To increase adaptability, learners are allowed to choose which updates to aggregate from their peers.
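
The abstract does not spell out how a learner picks which peer updates to aggregate or how the parallel models are combined, so the sketch below is only a minimal conceptual illustration of decentralized, selective aggregation on non-IID data. The peer-selection rule (ranking peer models by loss on the learner's own data), the plain parameter averaging, and the logistic-regression learners are all assumptions made for illustration, not the algorithm described in the paper.

```python
# Conceptual sketch only: decentralized FL where each learner selects which
# peer updates to aggregate. The selection rule (loss on the learner's own
# data), the plain averaging, and the logistic-regression model are
# illustrative assumptions, not the method from the paper.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One learner's local training: logistic regression via gradient descent.
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * (X.T @ (preds - y)) / len(y)
    return w

def local_loss(weights, X, y):
    # Cross-entropy of a model on this learner's own (non-IID) data.
    p = np.clip(1.0 / (1.0 + np.exp(-X @ weights)), 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def select_and_aggregate(own_weights, peer_weights, X, y, k=1):
    # Keep only the k peer updates that fit this learner's data best,
    # then average them with the learner's own model (no central server).
    best_peers = sorted(peer_weights, key=lambda w: local_loss(w, X, y))[:k]
    return np.mean(best_peers + [own_weights], axis=0)

# Toy run: three learners whose data come from shifted (non-IID) distributions.
rng = np.random.default_rng(0)
learners = []
for shift in (-2.0, 0.0, 2.0):
    X = rng.normal(shift, 1.0, size=(200, 3))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(float)
    learners.append((X, y))

models = [np.zeros(3) for _ in learners]
for _ in range(5):  # decentralized training rounds
    models = [local_update(w, X, y) for w, (X, y) in zip(models, learners)]
    models = [
        select_and_aggregate(
            models[i],
            [models[j] for j in range(len(models)) if j != i],
            *learners[i],
        )
        for i in range(len(models))
    ]
print([round(local_loss(models[i], *learners[i]), 3) for i in range(len(models))])
```

The point the sketch tries to capture is that each learner acts as its own aggregator: no single server produces the global model, and a learner only merges the peer updates that fit its local data, which is how several models can coexist in parallel under non-IID data.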

URL

https://arxiv.org/abs/2404.13421

PDF

https://arxiv.org/pdf/2404.13421.pdf

