AlphaGrad: Non-Linear Gradient Normalization Optimizer

2025-04-22 16:33:14
Soham Sane

Abstract

We introduce AlphaGrad, a memory-efficient, conditionally stateless optimizer addressing the memory overhead and hyperparameter complexity of adaptive methods like Adam. AlphaGrad enforces scale invariance via tensor-wise L2 gradient normalization followed by a smooth hyperbolic tangent transformation, $g = \tanh(\alpha \tilde{g})$, controlled by a single steepness parameter $\alpha$. Our contributions include: (1) the AlphaGrad algorithm formulation; (2) a formal non-convex convergence analysis guaranteeing stationarity; (3) extensive empirical evaluation on diverse RL benchmarks (DQN, TD3, PPO). Compared to Adam, AlphaGrad demonstrates a highly context-dependent performance profile. While exhibiting instability in off-policy DQN, it provides enhanced training stability with competitive results in TD3 (requiring careful $\alpha$ tuning) and achieves substantially superior performance in on-policy PPO. These results underscore the critical importance of empirical $\alpha$ selection, revealing strong interactions between the optimizer's dynamics and the underlying RL algorithm. AlphaGrad presents a compelling alternative optimizer for memory-constrained scenarios and shows significant promise for on-policy learning regimes where its stability and efficiency advantages can be particularly impactful.
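
The transformation described in the abstract (tensor-wise L2 normalization followed by a tanh with steepness α) can be illustrated with a minimal pure-Python sketch. This is not the author's implementation: the function names, the `eps` stabilizer, the flat-list representation of a tensor, and the plain gradient-descent update with learning rate `lr` are all assumptions made for illustration, and any optional state the paper's optimizer may keep is omitted.

```python
import math


def alphagrad_transform(grad, alpha, eps=1e-8):
    """Normalize one tensor's gradient by its L2 norm, then apply tanh.

    `grad` is a flat list of floats standing in for a single tensor.
    `eps` (an assumption, not from the abstract) avoids division by zero.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    normalized = [g / (norm + eps) for g in grad]
    # alpha controls how steeply the normalized gradient is squashed.
    return [math.tanh(alpha * g) for g in normalized]


def descent_step(params, grad, lr, alpha):
    """Hypothetical stateless update: params - lr * transformed gradient."""
    update = alphagrad_transform(grad, alpha)
    return [p - lr * u for p, u in zip(params, update)]


# Scale invariance: rescaling the raw gradient leaves the update
# essentially unchanged, because the L2 norm is divided out first.
small = alphagrad_transform([3.0, 4.0], alpha=1.0)
large = alphagrad_transform([3000.0, 4000.0], alpha=1.0)
```

Because every transformed entry lies strictly between -1 and 1, the step size per parameter is bounded by `lr` regardless of the raw gradient magnitude, and no per-parameter moment buffers of the kind Adam keeps are needed in this sketch.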

URL

https://arxiv.org/abs/2504.16020

PDF

https://arxiv.org/pdf/2504.16020.pdf