Paper Reading AI Learner

AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning

2024-05-02 07:49:28
Daniel Coquelin, Katherina Flügel, Marie Weiel, Nicholas Kiefer, Muhammed Öz, Charlotte Debus, Achim Streit, Markus Götz

Abstract

Communication bottlenecks hinder the scalability of distributed neural network training, particularly on distributed-memory computing clusters. To significantly reduce this communication overhead, we introduce AB-training, a novel data-parallel training method that decomposes weight matrices into low-rank representations and utilizes independent group-based training. This approach consistently reduces network traffic by 50% across multiple scaling scenarios, increasing the training potential on communication-constrained systems. Our method exhibits regularization effects at smaller scales, leading to improved generalization for models like VGG16, while achieving a remarkable 44.14 : 1 compression ratio during training on CIFAR-10 and maintaining competitive accuracy. Albeit promising, our experiments reveal that large batch effects remain a challenge even in low-rank training regimes.

Abstract (translated)

通信瓶颈阻碍了分布式神经网络训练的可扩展性,特别是在分布式内存计算集群上。为了显著降低这种通信开销,我们引入了AB-训练,一种新颖的数据并行训练方法,将权重矩阵分解为低秩表示,并利用独立组基训练。这种方法在多个缩放场景下,将网络流量减少50%,增加了在通信受限系统上的训练潜力。我们的方法在较小的尺度上表现出正则化效应,使得像VGG16这样的模型具有更好的泛化能力,而在CIFAR-10上训练时,取得了惊人的44.14 : 1的压缩比,保持竞争力的准确性。尽管前景看好,但我们的实验表明,在低秩训练环境中,大批效应仍然是一个挑战。

URL

https://arxiv.org/abs/2405.01067

PDF

https://arxiv.org/pdf/2405.01067.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot