Progressive Feedforward Collapse of ResNet Training

2024-05-02 03:48:08
Sicong Wang, Kuo Gai, Shihua Zhang

Abstract

Neural collapse (NC) is a simple and symmetric phenomenon that arises in deep neural networks (DNNs) at the terminal phase of training, where the last-layer features collapse to their class means and form a simplex equiangular tight frame aligned with the classifier vectors. However, the relationship of the last-layer features to the data and intermediate layers during training remains unexplored. To this end, we characterize the geometry of the intermediate layers of ResNet and propose a novel conjecture, progressive feedforward collapse (PFC), which claims that the degree of collapse increases during the forward propagation of DNNs. We derive a transparent model for the well-trained ResNet, based on the observation that a ResNet with weight decay approximates the geodesic curve in Wasserstein space at the terminal phase. Consistent with this conjecture, the PFC metrics decrease monotonically with depth on various datasets. We propose a new surrogate model, the multilayer unconstrained feature model (MUFM), which connects intermediate layers through an optimal transport regularizer. The optimal solution of MUFM is inconsistent with NC but is more concentrated relative to the input data. Overall, this study extends NC to PFC to model the collapse phenomenon of intermediate layers and its dependence on the input data, shedding light on the theoretical understanding of ResNet in classification problems.
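
As an illustration of what "degree of collapse" can mean at a given layer, the sketch below computes a simplified within-class variability score: the ratio of within-class scatter to between-class scatter of a feature matrix. This is an assumption made for illustration, not necessarily one of the PFC metrics used in the paper. The function name `within_class_variability` and the synthetic "layers" (class means plus noise that shrinks with depth) are hypothetical stand-ins for real ResNet activations.

```python
import numpy as np


def within_class_variability(features, labels):
    """Return trace(S_W) / trace(S_B) for one layer's features.

    features: (n_samples, dim) array; labels: (n_samples,) integer array.
    Smaller values mean the features sit closer to their class means.
    This is a simplified collapse score chosen for illustration only.
    """
    global_mean = features.mean(axis=0)
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Scatter of the samples around their own class mean.
        within += ((class_feats - class_mean) ** 2).sum()
        # Scatter of the class mean around the global mean, weighted by class size.
        between += len(class_feats) * ((class_mean - global_mean) ** 2).sum()
    return within / between


# Synthetic stand-in for three layer outputs (assumption: not real ResNet data).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)      # 3 classes, 50 samples each
means = rng.normal(size=(3, 8))           # one 8-dimensional mean per class
noise_scales = (1.0, 0.5, 0.1)            # noise shrinks with "depth"
layers = [means[labels] + s * rng.normal(size=(150, 8)) for s in noise_scales]

scores = [within_class_variability(f, labels) for f in layers]
print(scores)
```

With real networks, the feature matrices would come from the outputs of successive residual blocks on the same batch, and the conjecture would correspond to this kind of score shrinking from one block to the next.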

URL

https://arxiv.org/abs/2405.00985

PDF

https://arxiv.org/pdf/2405.00985.pdf

