Abstract
This paper introduces a mathematical framework for defining and quantifying self-identity in artificial intelligence (AI) systems, addressing a critical gap in the theoretical foundations of artificial consciousness. Existing approaches to artificial self-awareness often rely on heuristic implementations or philosophical abstractions; we instead present a formal framework grounded in metric space theory, measure theory, and functional analysis. The framework posits that self-identity emerges from two mathematically quantifiable conditions: the existence of a connected continuum of memories $C \subseteq \mathcal{M}$ in a metric space $(\mathcal{M}, d_{\mathcal{M}})$, and a continuous mapping $I: \mathcal{M} \to \mathcal{S}$ that maintains consistent self-recognition across this continuum, where $(\mathcal{S}, d_{\mathcal{S}})$ is the metric space of possible self-identities. To validate the framework, we conducted empirical experiments with the Llama 3.2 1B model, using Low-Rank Adaptation (LoRA) for efficient fine-tuning. The model was trained on a synthetic dataset of temporally structured memories, designed to capture the complexity of coherent self-identity formation. Our evaluation metrics included quantitative measures of self-awareness, response consistency, and linguistic precision. The experimental results show substantial improvements in these metrics, with the primary self-awareness score increasing from 0.276 to 0.801. These results point to a structured way of building AI systems whose self-identity features can be validated. The findings are directly relevant to humanoid robotics and autonomous systems.
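The two conditions can be illustrated on a finite sample of memories. The following is a minimal sketch, not the paper's implementation: it assumes memories and identities are embedded as coordinate tuples, uses the Euclidean distance as a stand-in for both $d_{\mathcal{M}}$ and $d_{\mathcal{S}}$, and replaces topological connectedness and continuity with finite proxies (chain connectivity at scale `eps`, and the largest identity change between memories within `delta` of each other). The function names, the toy data, and the toy mapping `identity` are all hypothetical.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance; an assumed stand-in for d_M and d_S."""
    return math.dist(a, b)


def is_eps_connected(points, eps):
    """Finite proxy for connectedness of C: every point can be reached
    from the first one through hops of length at most eps."""
    if not points:
        return False
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(len(points)):
            if j not in seen and dist(points[i], points[j]) <= eps:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(points)


def max_identity_drift(memories, identity, delta):
    """Finite proxy for continuity of I: the largest d_S(I(m), I(m'))
    over pairs of memories with d_M(m, m') <= delta."""
    drift = 0.0
    for m1, m2 in combinations(memories, 2):
        if dist(m1, m2) <= delta:
            drift = max(drift, dist(identity(m1), identity(m2)))
    return drift


# Hypothetical toy data: four memories on a line, evenly spaced by 0.5.
memories = [(0.0,), (0.5,), (1.0,), (1.5,)]


def identity(m):
    """Hypothetical identity mapping I: a gentle linear map into S."""
    return (0.1 * m[0],)


connected_coarse = is_eps_connected(memories, eps=0.6)  # True: gaps are 0.5
connected_fine = is_eps_connected(memories, eps=0.4)    # False: gaps exceed 0.4
drift = max_identity_drift(memories, identity, delta=0.6)
```

In this toy setting, a small `drift` for nearby memories together with chain connectivity at a chosen `eps` is one rough way to read "consistent self-recognition across a connected continuum"; the scales `eps` and `delta` are free parameters of the sketch rather than quantities defined in the abstract.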
URL
https://arxiv.org/abs/2411.18530