Paper Reading AI Learner

Parallel Hyperparameter Optimization Of Spiking Neural Network

2024-03-01 11:11:59
Thomas Firmin, Pierre Boulet, El-Ghazali Talbi

Abstract

Spiking Neural Networks (SNNs) are based on a more biologically inspired approach than usual artificial neural networks. Such models are characterized by complex dynamics between neurons and spikes, and are very sensitive to their hyperparameters, making optimization challenging. To tackle hyperparameter optimization of SNNs, we first extended the signal loss issue of SNNs to what we call silent networks: networks that fail to emit enough spikes at their outputs due to mistuned hyperparameters or architecture. Search spaces are generally heavily restrained, sometimes even discretized, to prevent the sampling of such networks. By defining an early stopping criterion that detects silent networks and by designing specific constraints, we were able to instantiate larger and more flexible search spaces. We applied a constrained Bayesian optimization technique, asynchronously parallelized because the evaluation time of an SNN is highly stochastic. Large-scale experiments were carried out on a multi-GPU petascale architecture. By leveraging silent networks, results show an acceleration of the search while maintaining good performance of both the optimization algorithm and the best solution obtained. We applied our methodology to two popular training algorithms, known as spike-timing-dependent plasticity and surrogate gradient. Early detection allowed us to prevent worthless and costly computation, directing the search toward promising hyperparameter combinations. Our methodology could be applied to multi-objective problems, where spiking activity is often minimized to reduce energy consumption. In this scenario, it becomes essential to find the delicate frontier between low-spiking and silent networks. Finally, our approach may have implications for neural architecture search, particularly in defining suitable spiking architectures.
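The early stopping idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `run_epoch` callable, the `min_spikes` threshold, and the penalized score returned for silent candidates are all hypothetical choices made here for clarity.

```python
def is_silent(output_spike_counts, min_spikes=1):
    """A candidate is 'silent' when its output layer emits fewer than
    `min_spikes` spikes over an evaluation window (threshold is illustrative)."""
    return sum(output_spike_counts) < min_spikes

def evaluate_with_early_stop(run_epoch, n_epochs, min_spikes=1, penalty=0.0):
    """Evaluate one hyperparameter candidate, aborting as soon as the
    network is detected to be silent.

    run_epoch(epoch) -> (score, output_spike_counts) is a hypothetical
    interface standing in for one epoch of SNN training/evaluation.
    Returns (score, epochs_actually_run).
    """
    score = penalty
    for epoch in range(n_epochs):
        score, spike_counts = run_epoch(epoch)
        if is_silent(spike_counts, min_spikes):
            # Silent network: stop immediately and report a penalized
            # score so the Bayesian optimizer steers away from this
            # region instead of paying for the remaining epochs.
            return penalty, epoch + 1
    return score, n_epochs
```

In an asynchronously parallelized search, each worker would run such an evaluation independently, so a silent candidate frees its GPU early for the next sample.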

URL

https://arxiv.org/abs/2403.00450

PDF

https://arxiv.org/pdf/2403.00450.pdf

