PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion

2023-05-25 08:23:26
Yige Yuan, Bingbing Xu, Bo Lin, Liang Hou, Fei Sun, Huawei Shen, Xueqi Cheng

Abstract

The generalization of neural networks is a central challenge in machine learning, especially concerning the performance under distributions that differ from training ones. Current methods, mainly based on the data-driven paradigm such as data augmentation, adversarial training, and noise injection, may encounter limited generalization due to model non-smoothness. In this paper, we propose to investigate generalization from a Partial Differential Equation (PDE) perspective, aiming to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data. Specifically, we first establish the connection between neural network generalization and the smoothness of the solution to a specific PDE, namely the "transport equation". Building upon this, we propose a general framework that introduces adaptive distributional diffusion into the transport equation to enhance the smoothness of its solution, thereby improving generalization. In the context of neural networks, we put this theoretical framework into practice as PDE+ (PDE with Adaptive Distributional Diffusion), which diffuses each sample into a distribution covering semantically similar inputs. This enables better coverage of potentially unobserved distributions in training, thus improving generalization beyond merely data-driven methods. The effectiveness of PDE+ is validated in extensive settings, including clean samples and various corruptions, demonstrating its superior performance compared to SOTA methods.
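
The link the abstract draws between a diffusion term and a smoother solution can be illustrated on a textbook toy problem. The sketch below is not the PDE+ method: it assumes a 1D periodic grid, a constant velocity, and a constant isotropic diffusion coefficient (nothing adaptive or distributional), and the helper names `evolve` and `max_slope` are hypothetical. It only shows that solving u_t + c u_x = nu u_xx with nu > 0 flattens a sharp step more than the pure transport equation (nu = 0) does.

```python
import numpy as np


def evolve(u0, steps, cfl=0.4, diff=0.0):
    """Advance u_t + c*u_x = nu*u_xx on a periodic 1D grid (explicit scheme).

    cfl  = c * dt / dx      (advection number, first-order upwind)
    diff = nu * dt / dx**2  (diffusion number; 0 gives pure transport)
    The update is stable when cfl + 2 * diff <= 1.
    """
    if cfl < 0 or diff < 0 or cfl + 2 * diff > 1:
        raise ValueError("need cfl >= 0, diff >= 0 and cfl + 2*diff <= 1")
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        left = np.roll(u, 1)    # u[i-1], periodic wrap-around
        right = np.roll(u, -1)  # u[i+1], periodic wrap-around
        u = u - cfl * (u - left) + diff * (right - 2.0 * u + left)
    return u


def max_slope(u):
    """Largest jump between neighbouring grid values (a crude smoothness measure)."""
    return float(np.max(np.abs(u - np.roll(u, 1))))


# Toy initial condition: a box with two sharp edges.
n = 200
u0 = np.zeros(n)
u0[50:100] = 1.0

pure = evolve(u0, steps=50)                # transport only
diffused = evolve(u0, steps=50, diff=0.2)  # transport plus constant diffusion

print("max slope, initial:  ", max_slope(u0))
print("max slope, transport:", max_slope(pure))
print("max slope, diffused: ", max_slope(diffused))
```

Note that the upwind scheme adds some numerical smoothing of its own, so even the pure transport run softens the step slightly; the explicit diffusion term softens it further. How PDE+ chooses its diffusion adaptively per sample is described in the paper, not in this sketch.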

URL

https://arxiv.org/abs/2305.15835

PDF

https://arxiv.org/pdf/2305.15835.pdf

