Abstract
In the quest towards general artificial intelligence (AI), researchers have explored developing loss functions that act as intrinsic motivators in the absence of external rewards. This paper argues that such research has overlooked an important and useful intrinsic motivator: social interaction. We posit that making an AI agent aware of implicit social feedback from humans can allow for faster learning of more generalizable and useful representations, and may also have implications for AI safety. We collect social feedback in the form of facial expression reactions to samples from Sketch RNN, an LSTM-based variational autoencoder (VAE) designed to produce sketch drawings. We use a Latent Constraints GAN (LC-GAN) to learn from the facial feedback of a small group of viewers, optimizing the model to produce sketches that it predicts will elicit more positive facial expressions. We show in multiple independent evaluations that the model trained with facial feedback produced sketches that were rated more highly and induced significantly more positive facial expressions. Thus, we establish that implicit social feedback can improve the output of a deep learning model.
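The core mechanism described above is to keep a pretrained generative model fixed and search its latent space for samples that a learned predictor scores as eliciting more positive reactions. The following is a minimal numerical sketch of that idea only, not the paper's actual pipeline: the real system uses Sketch RNN as the decoder, a learned facial-expression predictor as the scorer, and an LC-GAN (a generator/critic trained in latent space) rather than the per-sample gradient ascent shown here. All names, shapes, and the toy quadratic "positivity" score are illustrative assumptions.

```python
import numpy as np

# Toy stand-ins (assumptions, not the paper's actual models):
# a frozen linear "decoder" mapping latent code z -> sketch features,
# and a "facial-feedback predictor" that scores how positive a viewer
# reaction is expected to be. Here the predictor simply rewards
# closeness of the decoded features to a preferred feature vector.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))          # frozen decoder weights (illustrative)
preferred = rng.normal(size=8)       # features the toy predictor "likes"

def decode(z):
    # Frozen generative model: latent code -> sketch features.
    return W @ z

def predicted_positivity(z):
    # Toy scorer: higher means a more positive predicted reaction.
    diff = decode(z) - preferred
    return -float(diff @ diff)

def grad_positivity(z):
    # Analytic gradient of the score with respect to the latent code.
    return -2.0 * W.T @ (decode(z) - preferred)

def optimize_latent(z, steps=200, lr=0.05):
    # Gradient ascent in latent space: the decoder never changes,
    # only the latent code moves toward higher predicted reward --
    # the shared idea behind latent-constraint approaches.
    for _ in range(steps):
        z = z + lr * grad_positivity(z)
    return z

z0 = rng.normal(size=4)              # an initial random sample
z1 = optimize_latent(z0)             # shifted toward higher predicted positivity
```

After optimization, decoding `z1` instead of `z0` yields a sample the (toy) predictor scores higher; in the paper's setting, the analogous shift is performed by the LC-GAN so that sampling itself lands in high-reward regions of latent space.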
URL
https://arxiv.org/abs/1802.04877