
A Hybrid Multi-Well Hopfield-CNN with Feature Extraction and K-Means for MNIST Classification


Abstract

This study presents a hybrid model for classifying handwritten digits in the MNIST dataset, combining convolutional neural networks (CNNs) with a multi-well Hopfield network. The approach employs a CNN to extract high-dimensional features from input images, which are then clustered into class-specific prototypes using k-means clustering. These prototypes serve as attractors in a multi-well energy landscape, where a Hopfield network performs classification by minimizing an energy function that balances feature similarity and class assignment. The model's design enables robust handling of intraclass variability, such as diverse handwriting styles, while providing an interpretable framework through its energy-based decision process. Through systematic optimization of the CNN architecture and the number of wells, the model achieves a test accuracy of 99.2% on 10,000 MNIST images, demonstrating its effectiveness for image classification tasks. The findings highlight the critical role of deep feature extraction and sufficient prototype coverage in achieving high performance, with potential for broader applications in pattern recognition.
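The pipeline described in the abstract (features, per-class k-means prototypes, lowest-energy classification) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the 2-D synthetic points stand in for CNN features, the helper names (`kmeans`, `build_wells`, `class_energy`, `classify`) are hypothetical, and the energy is simplified to the squared distance to the nearest prototype of a class, since the abstract does not give the actual energy function.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k)
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assignment = dists.argmin(axis=1)
        for j in range(k):
            members = points[assignment == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def build_wells(features, labels, wells_per_class):
    """Cluster each class's features separately into prototypes (wells)."""
    return {int(c): kmeans(features[labels == c], wells_per_class)
            for c in np.unique(labels)}

def class_energy(x, prototypes):
    """Assumed energy: squared distance to the nearest well of one class."""
    return ((prototypes - x) ** 2).sum(axis=1).min()

def classify(x, wells):
    """Return the class whose wells give the lowest energy at x."""
    return min(wells, key=lambda c: class_energy(x, wells[c]))

# Synthetic stand-in for CNN features: each class has two distinct modes,
# mimicking intraclass variability such as different handwriting styles.
rng = np.random.default_rng(0)
modes = {0: [(-5.0, 0.0), (5.0, 0.0)], 1: [(0.0, 5.0), (0.0, -5.0)]}
features, labels = [], []
for cls, centers in modes.items():
    for center in centers:
        features.append(rng.normal(loc=center, scale=0.3, size=(50, 2)))
        labels.append(np.full(50, cls))
features = np.concatenate(features)
labels = np.concatenate(labels)

wells = build_wells(features, labels, wells_per_class=2)

queries = np.array([[-5.0, 0.2], [5.0, 0.0], [0.1, 5.0], [0.0, -5.0]])
predictions = [classify(q, wells) for q in queries]
print(predictions)
```

In this toy setup the two class means both fall near the origin, so a single prototype per class could not separate them, while two wells per class can. That is one way to read the abstract's point about sufficient prototype coverage.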

URL

https://arxiv.org/abs/2507.08766

PDF

https://arxiv.org/pdf/2507.08766.pdf

