
Brain-like Functional Organization within Large Language Models

2024-10-25 13:15:17
H. Sun, L. Zhao, Z. Wu, X. Gao, Y. Hu, M. Zuo, W. Zhang, J. Han, T. Liu, X. Hu

Abstract

The human brain has long inspired the pursuit of artificial intelligence (AI). Recent neuroimaging studies provide compelling evidence of alignment between the computational representations of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli, suggesting that ANNs may employ brain-like information processing strategies. While such alignment has been observed across modalities (visual, auditory, and linguistic), most work has focused on the behavior of artificial neurons (ANs) at the population level, leaving the functional organization of individual ANs that supports such brain-like processing largely unexplored. In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs), the foundational organizational structure of the human brain. Specifically, we extract representative patterns from the temporal responses of ANs in large language models (LLMs) and use them as fixed regressors to construct voxel-wise encoding models that predict brain activity recorded by functional magnetic resonance imaging (fMRI). This framework links the AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within LLMs. Our findings reveal that LLMs (BERT and Llama 1-3) exhibit brain-like functional architecture, with sub-groups of artificial neurons mirroring the organizational patterns of well-established FBNs. Notably, the brain-like functional organization of LLMs evolves as the models grow in sophistication and capability, achieving an improved balance between the diversity of computational behaviors and the consistency of functional specializations. This research represents the first exploration of brain-like functional organization within LLMs, offering novel insights to inform the development of artificial general intelligence (AGI) guided by principles of the human brain.
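
The encoding pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the representative patterns are extracted, so this sketch assumes a plain SVD (PCA-style) decomposition as a stand-in, assumes ridge regression for the voxel-wise encoding model, and runs on synthetic data in place of real LLM activations and fMRI recordings. All function names and parameters (`representative_patterns`, `fit_voxelwise_encoding`, `ridge`, `run_demo`) are hypothetical.

```python
import numpy as np


def representative_patterns(an_responses, n_patterns):
    """Extract temporal patterns from AN responses of shape (time, neurons).

    Assumption: an SVD of the centered responses stands in for the
    (unspecified) pattern-extraction step. Returns patterns of shape
    (time, n_patterns) and loadings of shape (n_patterns, neurons).
    """
    centered = an_responses - an_responses.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    patterns = u[:, :n_patterns] * s[:n_patterns]
    loadings = vt[:n_patterns]
    return patterns, loadings


def fit_voxelwise_encoding(patterns, bold, ridge=1.0):
    """Fit one ridge model per voxel, using the patterns as fixed regressors.

    patterns: (time, n_patterns); bold: (time, voxels).
    Returns weights (n_patterns, voxels) and an intercept (voxels,).
    """
    x_mean = patterns.mean(axis=0)
    y_mean = bold.mean(axis=0)
    x = patterns - x_mean
    y = bold - y_mean
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    weights = np.linalg.solve(gram, x.T @ y)
    intercept = y_mean - x_mean @ weights
    return weights, intercept


def voxelwise_correlation(predicted, actual):
    """Pearson correlation between predicted and actual signal, per voxel."""
    p = predicted - predicted.mean(axis=0)
    a = actual - actual.mean(axis=0)
    denom = np.linalg.norm(p, axis=0) * np.linalg.norm(a, axis=0)
    return (p * a).sum(axis=0) / np.where(denom == 0, 1.0, denom)


def run_demo(seed=0):
    """Synthetic example: AN responses and voxels share a few latent signals."""
    rng = np.random.default_rng(seed)
    n_time, n_neurons, n_voxels, n_latent = 200, 50, 30, 5
    latent = rng.standard_normal((n_time, n_latent))
    an = latent @ rng.standard_normal((n_latent, n_neurons))
    an += 0.1 * rng.standard_normal((n_time, n_neurons))
    bold = latent @ rng.standard_normal((n_latent, n_voxels))
    bold += 0.1 * rng.standard_normal((n_time, n_voxels))

    patterns, loadings = representative_patterns(an, n_latent)

    # Fit on the first 150 time points, evaluate on the held-out 50.
    train, test = slice(0, 150), slice(150, n_time)
    weights, intercept = fit_voxelwise_encoding(patterns[train], bold[train])
    predicted = patterns[test] @ weights + intercept
    r = voxelwise_correlation(predicted, bold[test])
    return patterns, loadings, r


if __name__ == "__main__":
    _, _, r = run_demo()
    print("mean held-out voxel correlation:", float(r.mean()))
```

In this sketch, each row of `loadings` indicates which artificial neurons contribute to a pattern, and each row of `weights` indicates which voxels that pattern predicts; reading the two together is one simple way to relate an AN sub-group to a spatial brain map, under the assumptions stated above.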

URL

https://arxiv.org/abs/2410.19542

PDF

https://arxiv.org/pdf/2410.19542.pdf

