GLAI: GreenLightningAI for Accelerated Training through Knowledge Decoupling

2025-10-01 13:31:34
Jose I. Mestre, Alberto Fernández-Hernández, Cristian Pérez-Corral, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-Ortí

Abstract

In this work we introduce GreenLightningAI (GLAI), a new architectural block designed as an alternative to conventional MLPs. The central idea is to separate two types of knowledge that are usually entangled during training: (i) *structural knowledge*, encoded by the stable activation patterns induced by ReLU activations; and (ii) *quantitative knowledge*, carried by the numerical weights and biases. By fixing the structure once it has stabilized, GLAI reformulates the MLP as a combination of paths in which only the quantitative component is optimized. This reformulation retains the universal approximation capabilities of MLPs, yet achieves a more efficient training process, reducing training time by roughly 40% on average across the cases examined in this study. Crucially, GLAI is not just another classifier but a generic block that can replace MLPs wherever they are used, from supervised heads on frozen backbones to projection layers in self-supervised learning and few-shot classifiers. Across diverse experimental setups, GLAI consistently matches or exceeds the accuracy of MLPs with an equivalent number of parameters while converging faster. Overall, GLAI establishes a new design principle and opens a direction for future integration into large-scale architectures such as Transformers, where MLP blocks dominate the computational footprint.
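The mechanism the abstract describes, freezing the ReLU activation pattern once it stabilizes so that only the numerical weights continue to be trained, can be illustrated with a small PyTorch sketch. This is a minimal illustration of the general idea under stated assumptions, not the authors' implementation: the class name `DecoupledLinearReLU`, the warm-up/freeze API, and the choice to derive the fixed gating from a frozen copy of the warm-up weights are all hypothetical.

```python
import torch
import torch.nn as nn

class DecoupledLinearReLU(nn.Module):
    """One hidden layer that separates structural knowledge (the ReLU
    gating pattern) from quantitative knowledge (the linear weights).
    Hypothetical sketch of the idea, not the paper's implementation."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.value = nn.Linear(in_features, out_features)  # trainable: quantitative knowledge
        self.gate = nn.Linear(in_features, out_features)   # frozen after warm-up: structural knowledge
        self.frozen = False

    def freeze_structure(self):
        # Snapshot the warm-up weights into the gating branch and stop its
        # gradients; from now on the activation pattern is fixed by this copy.
        self.gate.load_state_dict(self.value.state_dict())
        for p in self.gate.parameters():
            p.requires_grad_(False)
        self.frozen = True

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pre = self.value(x)
        if self.frozen:
            # Fixed activation pattern from the frozen structural copy.
            mask = (self.gate(x) > 0).to(pre.dtype)
        else:
            # Ordinary ReLU during warm-up: pre * (pre > 0) == relu(pre).
            mask = (pre > 0).to(pre.dtype)
        return pre * mask

# Usage: train normally until activation patterns stabilize,
# then freeze the structure and keep optimizing only the values.
layer = DecoupledLinearReLU(128, 64)
# ... warm-up epochs here ...
layer.freeze_structure()
```

Once the mask is fixed, each layer's output is linear in the trainable weights, so the network reduces to a fixed combination of linear paths in which only the quantitative component is optimized, which is the reformulation the abstract attributes to the faster convergence.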

URL

https://arxiv.org/abs/2510.00883

PDF

https://arxiv.org/pdf/2510.00883.pdf

