Abstract
Foundation models have had a major impact in recent years, and billions of dollars are being invested in them during the current AI boom. The more popular ones, such as ChatGPT, are trained on large amounts of data from the Internet; reinforcement learning, retrieval-augmented generation (RAG), prompt engineering and cognitive modelling are then used to fine-tune and augment their behavior. This technology has been used to create models of individual people, such as Caryn Marjorie. However, these chatbots are not based on people's actual emotional and physiological responses to their environment, so they are, at best, surface-level approximations of the characters they imitate. This paper describes how a new type of foundation model, a first-person foundation model, could be created from recordings of what a person sees and hears, together with their emotional and physiological reactions to these stimuli. A first-person foundation model would map environmental stimuli to a person's emotional and physiological states, and map those states to the person's behavior. First-person foundation models have many potential applications, including a new type of recommendation engine, personal assistants, generative adversarial networks, dating and recruitment. To obtain training data for a first-person foundation model, we have developed a recording rig that captures what the wearer sees and hears as well as their emotional and physiological states. This novel source of data could help to address the shortage of new data for building the next generation of foundation models.
URL
https://arxiv.org/abs/2408.00030