Neural Architecture Search using Particle Swarm and Ant Colony Optimization

2024-03-06 15:23:26
Séamus Lankford, Diarmuid Grimes

Abstract

Neural network models have a number of hyperparameters that must be chosen along with their architecture. Choosing an architecture and assigning values to these parameters can be a heavy burden for a novice user, so in most cases default hyperparameters and architectures are used. Significant improvements in model accuracy can be achieved by evaluating multiple architectures. A process known as Neural Architecture Search (NAS) may be applied to automatically evaluate a large number of such architectures. As part of this research, a system integrating open source tools for Neural Architecture Search (OpenNAS) has been developed for image classification. OpenNAS takes any dataset of grayscale or RGB images and generates Convolutional Neural Network (CNN) architectures based on a range of metaheuristics, using an AutoKeras, transfer learning, or Swarm Intelligence (SI) approach. Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are used as the SI algorithms. Furthermore, models developed through such metaheuristics may be combined using stacking ensembles. In this paper, we focus on training and optimizing CNNs using the SI components of OpenNAS. The two major types of SI algorithm, PSO and ACO, are compared to see which is more effective at generating models with higher accuracy. With our experimental design, the PSO algorithm is shown to perform better than ACO, and its performance improvement is most notable on a more complex dataset. As a baseline, the performance of fine-tuned pre-trained models is also evaluated.
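To make the PSO side of the comparison more concrete, the following is a minimal sketch of a canonical global-best PSO loop in Python. It is not the OpenNAS implementation. The function names, the two-dimensional search space (log learning rate and number of filters), the coefficient values, and the toy objective are all assumptions made for illustration. In a real NAS setting the objective would train a CNN and return its validation error, which is far more expensive than the stand-in used here.

```python
import random


def pso_minimize(objective, bounds, n_particles=12, n_iters=40,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Global-best PSO over a box-bounded continuous space (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and score.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm shares a single global best.
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp the coordinate back into its bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_error(x):
    # Hypothetical stand-in for "train a CNN, return validation error".
    # Its minimum is at log_lr = -3 and n_filters = 64 by construction.
    log_lr, n_filters = x
    return (log_lr + 3.0) ** 2 + ((n_filters - 64.0) / 32.0) ** 2


# Assumed search space: log10(learning rate) and number of conv filters.
BOUNDS = [(-5.0, -1.0), (8.0, 128.0)]
best, err = pso_minimize(toy_validation_error, BOUNDS)
print("best position:", best, "error:", err)
```

Because each objective call would correspond to a full training run, the swarm size and iteration count dominate the cost of such a search. Discrete choices such as layer counts would also need rounding or a separate encoding, which this sketch leaves out.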


URL

https://arxiv.org/abs/2403.03781

PDF

https://arxiv.org/pdf/2403.03781.pdf

