Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning

2025-10-09 17:03:00
Ankur Mali, Lawrence Hall, Jake Williams, Gordon Richards

Abstract

Activation functions govern the expressivity and stability of neural networks, yet existing comparisons remain largely heuristic. We propose a rigorous framework for their classification via a nine-dimensional integral signature S_sigma(phi), combining Gaussian propagation statistics (m1, g1, g2, m2, eta), asymptotic slopes (alpha_plus, alpha_minus), and regularity measures (TV(phi'), C(phi)). This taxonomy establishes well-posedness, affine reparameterization laws with bias, and closure under bounded slope variation. Dynamical analysis yields Lyapunov theorems with explicit descent constants and identifies variance stability regions through (m2', g2). From a kernel perspective, we derive dimension-free Hessian bounds and connect smoothness to bounded variation of phi'. Applying the framework, we classify eight standard activations (ReLU, leaky-ReLU, tanh, sigmoid, Swish, GELU, Mish, TeLU), proving sharp distinctions between saturating, linear-growth, and smooth families. Numerical Gauss-Hermite and Monte Carlo validation confirms theoretical predictions. Our framework provides principled design guidance, moving activation choice from trial-and-error to provable stability and kernel conditioning.
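As a rough illustration of the numerical validation mentioned in the abstract, the sketch below estimates a few Gaussian statistics of an activation with Gauss-Hermite quadrature and cross-checks them with Monte Carlo sampling. The abstract does not define the signature components, so the definitions used here are assumptions: m1 = E[phi(Z)], m2 = E[phi(Z)^2], g1 = E[phi'(Z)], g2 = E[phi'(Z)^2] with Z ~ N(0, 1). The paper's own definitions may differ. The remaining components (eta, the asymptotic slopes, TV(phi'), C(phi)) are left out, and all function names are hypothetical.

```python
import numpy as np


def gaussian_expectation(f, n_nodes=80):
    """Approximate E[f(Z)] for Z ~ N(0, 1) with Gauss-Hermite quadrature."""
    # hermegauss gives nodes and weights for the weight function exp(-x^2 / 2),
    # so dividing by sqrt(2 * pi) turns the sum into a standard normal expectation.
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    return float(np.sum(w * f(x)) / np.sqrt(2.0 * np.pi))


def relu(x):
    return np.maximum(x, 0.0)


def relu_grad(x):
    # Derivative of ReLU almost everywhere; the value at 0 is set to 0.
    return (x > 0).astype(float)


def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2


def gaussian_stats(phi, dphi, n_nodes=80):
    """Assumed definitions (not taken from the paper) of m1, m2, g1, g2."""
    return {
        "m1": gaussian_expectation(phi, n_nodes),
        "m2": gaussian_expectation(lambda x: phi(x) ** 2, n_nodes),
        "g1": gaussian_expectation(dphi, n_nodes),
        "g2": gaussian_expectation(lambda x: dphi(x) ** 2, n_nodes),
    }


def monte_carlo_stats(phi, dphi, n_samples=200_000, seed=0):
    """The same four statistics, estimated from standard normal samples."""
    z = np.random.default_rng(seed).standard_normal(n_samples)
    return {
        "m1": float(np.mean(phi(z))),
        "m2": float(np.mean(phi(z) ** 2)),
        "g1": float(np.mean(dphi(z))),
        "g2": float(np.mean(dphi(z) ** 2)),
    }


if __name__ == "__main__":
    for name, phi, dphi in [("ReLU", relu, relu_grad), ("tanh", np.tanh, tanh_grad)]:
        print(name, "quadrature :", gaussian_stats(phi, dphi))
        print(name, "Monte Carlo:", monte_carlo_stats(phi, dphi))
```

For ReLU these quantities have closed forms under the assumed definitions (m1 = 1/sqrt(2*pi), m2 = g1 = g2 = 1/2), which gives a quick sanity check. Quadrature converges slowly for m1 because ReLU has a kink at the origin, so the Monte Carlo estimate is a useful second opinion there.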

URL

https://arxiv.org/abs/2510.08456

PDF

https://arxiv.org/pdf/2510.08456.pdf

