Abstract
This paper presents a theoretical framework explaining why fine-tuning small, randomly selected subnetworks (slices) within pre-trained models can be sufficient for downstream adaptation. We prove that pretrained networks exhibit a universal winning slice property arising from two phenomena: (1) spectral balance: the eigenspectra of different weight matrix slices are remarkably similar; and (2) high task energy: their backbone representations retain rich, task-relevant features. This leads to the Universal Winning Slice Hypothesis, which provides a theoretical foundation for parameter-efficient fine-tuning (PEFT) in large-scale models. Inspired by this, we propose SliceFine, a PEFT method that exploits this inherent redundancy by updating only selected slices of the original weights, introducing zero new parameters, unlike adapter-based approaches. Empirically, SliceFine matches the performance of state-of-the-art PEFT methods across language and vision tasks, while significantly improving training speed, memory efficiency, and model compactness. Our work bridges theory and practice, offering a theoretically grounded alternative to existing PEFT techniques.
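The abstract describes SliceFine as updating only a selected slice of each original weight matrix while adding no new parameters. Below is a minimal sketch of that idea in PyTorch, not the authors' implementation: it picks a random contiguous block of a weight matrix and masks gradients outside it, so only the slice is updated. The slice size, the contiguous-block selection, and the helper name `register_slice_mask` are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of slice-only fine-tuning (assumed setup, not the official SliceFine code).
import torch
import torch.nn as nn


def register_slice_mask(linear: nn.Linear, slice_rows: int, slice_cols: int) -> None:
    """Keep only a random (slice_rows x slice_cols) block of linear.weight trainable."""
    out_dim, in_dim = linear.weight.shape
    r0 = torch.randint(0, out_dim - slice_rows + 1, (1,)).item()
    c0 = torch.randint(0, in_dim - slice_cols + 1, (1,)).item()

    mask = torch.zeros_like(linear.weight)
    mask[r0:r0 + slice_rows, c0:c0 + slice_cols] = 1.0

    # Zero gradients outside the selected slice, so the optimizer only
    # updates the slice; no new parameters are introduced.
    linear.weight.register_hook(lambda grad: grad * mask)
    if linear.bias is not None:
        linear.bias.requires_grad_(False)


# Toy usage: a single linear layer where only an 8x8 slice receives updates.
layer = nn.Linear(64, 64)
register_slice_mask(layer, slice_rows=8, slice_cols=8)

opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x, y = torch.randn(16, 64), torch.randn(16, 64)
loss = nn.functional.mse_loss(layer(x), y)
loss.backward()
opt.step()
```

Because the update is applied in place to the existing weights, the fine-tuned model keeps its original size, in contrast to adapter-based PEFT methods that attach extra trainable modules.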
URL
https://arxiv.org/abs/2510.08513