FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts

2025-10-09 16:17:13
Heming Zou, Yunliang Zang, Wutong Xu, Yao Zhu, Xiangyang Ji

Abstract

Low-Rank Adaptation (LoRA) is a widely used parameter-efficient fine-tuning method for foundation models, but it suffers from parameter interference, resulting in suboptimal performance. Although Mixture-of-Experts (MoE)-based LoRA variants show promise in mitigating intra-task correlations in single-task instruction tuning, they introduce additional router parameters and remain ineffective in multi-task model merging where inter-task interference arises. Inspired by the fly olfactory circuit, we propose FlyLoRA, an implicit MoE-based LoRA variant that introduces: (1) rank-wise expert activation in the up-projection matrix, and (2) an implicit router that unifies expert routing and down-projection, where a frozen sparse random projection matrix replaces the traditional dense trainable version. This design resolves the trade-off between intra-task decorrelation and computational efficiency by eliminating the need for an explicit router, while inherently mitigating inter-task interference due to the orthogonality property of random matrices. Extensive experiments across four domains -- general knowledge understanding, scientific question answering, mathematical reasoning, and code generation -- demonstrate consistent performance improvements over existing methods. Beyond empirical gains, FlyLoRA highlights how biological structures can inspire innovations in AI technologies. Code is available at this https URL.
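
To make the two ingredients in the abstract concrete, here is a minimal sketch of one plausible reading of the design, not the authors' implementation. It assumes the frozen sparse random down-projection A produces one score per rank, that the k ranks with the largest magnitude are the ones activated, and that only the matching columns of the trainable up-projection B contribute to the update. The function names, the top-k rule, the sparsity level, and the initialization of B are all illustrative assumptions; the abstract does not specify them.

```python
import random

def sparse_random_matrix(rows, cols, nonzeros_per_row, rng):
    """Frozen sparse random projection: a few Gaussian entries per row, zeros elsewhere."""
    matrix = [[0.0] * cols for _ in range(rows)]
    for row in matrix:
        for j in rng.sample(range(cols), nonzeros_per_row):
            row[j] = rng.gauss(0.0, 1.0)
    return matrix

def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def top_k_indices(values, k):
    """Indices of the k entries with the largest magnitude (assumed selection rule)."""
    return sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]

def flylora_delta(x, A, B, k, scale=1.0):
    """Low-rank update for a single input vector x.

    A: frozen sparse random down-projection, shape (r, d_in); never trained.
    B: trainable up-projection, shape (d_out, r).
    Only the k ranks selected from |A x| contribute to the output.
    """
    z = matvec(A, x)              # down-projection output doubles as router scores
    active = top_k_indices(z, k)  # implicit routing: no separate router parameters
    delta = [scale * sum(row[i] * z[i] for i in active) for row in B]
    return delta, active

# Hypothetical sizes chosen only for illustration.
rng = random.Random(0)
d_in, d_out, r, k = 16, 8, 8, 2
A = sparse_random_matrix(r, d_in, nonzeros_per_row=4, rng=rng)
# Small random B so the demo output is nonzero; LoRA commonly starts B at zero.
B = [[rng.gauss(0.0, 0.02) for _ in range(r)] for _ in range(d_out)]
x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]
delta, active = flylora_delta(x, A, B, k)
```

In this sketch the routing decision costs nothing beyond the down-projection that LoRA already computes, which is the sense in which the router is "implicit". Setting k equal to r recovers the ordinary dense LoRA update B(Ax).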

URL

https://arxiv.org/abs/2510.08396

PDF

https://arxiv.org/pdf/2510.08396.pdf

