A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation

2024-07-31 16:48:06
Mothilal Asokan, Joseph Geo Benjamin, Mohammad Yaqub, Karthik Nandakumar

Abstract

Adapting foundation models for medical image analysis requires fine-tuning them on a considerable amount of data because of the extreme distribution shift between the natural (source) data used for pretraining and the medical (target) data. However, collecting task-specific medical data for such fine-tuning at a central location raises many privacy concerns. Although federated learning (FL) provides an effective means of training on private decentralized data, the communication cost of federating large foundation models can quickly become a significant bottleneck, impacting the solution's scalability. In this work, we address the problem of efficient communication while ensuring effective learning in FL by combining the strengths of parameter-efficient fine-tuning (PEFT) with FL. Specifically, we study plug-and-play Low-Rank Adapters (LoRA) in a federated manner to adapt the Segment Anything Model (SAM) for 3D medical image segmentation. Unlike prior works that utilize LoRA and fine-tune the entire decoder, we critically analyze the contribution of each granular component of SAM to fine-tuning performance. We thus identify specific layers to be federated that are highly efficient in terms of communication cost while producing on-par accuracy. Our experiments show that retaining the parameters of the SAM model (including most of the decoder) in their original state during adaptation is beneficial, because fine-tuning on small datasets tends to distort the inherent capabilities of the underlying foundation model. On Fed-KiTS, our approach reduces communication cost (~48x) compared to full fine-tuning while improving performance (~6% Dice score) on 3D segmentation tasks. Our approach performs similarly to SAMed while achieving a ~2.8x reduction in communication and in the number of parameters to be fine-tuned. We further validate our approach with experiments on the Fed-IXI and Prostate MRI datasets.
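The communication savings above come from the core LoRA idea: each frozen weight matrix W is augmented with a trainable low-rank update BA, and in an FL round only the small A and B matrices need to travel between clients and server. The following is a minimal NumPy sketch of that idea, not the paper's implementation; the class name, dimensions (a 768-wide attention projection, rank 4), and the FedAvg helper are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a frozen base weight W and a trainable
    low-rank update B @ A (LoRA). Effective weight: W + B @ A."""

    def __init__(self, d_in, d_out, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in))        # frozen, never communicated
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # trainable
        self.B = np.zeros((d_out, rank))                   # trainable, zero-init so
                                                           # the update starts at 0

    def __call__(self, x):
        # x: (batch, d_in) -> (batch, d_out)
        return x @ (self.W + self.B @ self.A).T

    def trainable_params(self):
        return self.A.size + self.B.size

    def frozen_params(self):
        return self.W.size

def fedavg_adapters(client_layers):
    """Average only the LoRA matrices across clients (the FL server
    never sees W)."""
    A = np.mean([l.A for l in client_layers], axis=0)
    B = np.mean([l.B for l in client_layers], axis=0)
    return A, B

# Per-round communication for one 768x768 projection at rank 4:
layer = LoRALinear(768, 768, rank=4)
ratio = layer.frozen_params() / layer.trainable_params()
# trainable: 4*768 + 768*4 = 6144 values vs 589824 frozen -> ~96x fewer
# values exchanged per round for this layer than full fine-tuning.
```

This per-layer ratio is illustrative only; the paper's reported ~48x overall saving also reflects which SAM components are federated at all.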

URL

https://arxiv.org/abs/2407.21739

PDF

https://arxiv.org/pdf/2407.21739.pdf
