Abstract
Everyday devices like light bulbs and kitchen appliances are now embedded with so many features and automated behaviors that they have become complicated to actually use. While such "smart" capabilities can better support users' goals, the task of learning the "ins and outs" of different devices is daunting. Voice assistants aim to solve this problem by providing a natural language interface to devices, yet such assistants cannot understand loosely constrained commands, they lack the ability to reason about and explain devices' behaviors to users, and they rely on connectivity to intrusive cloud infrastructure. To address these issues, we propose thoughtful things: devices that leverage lightweight, on-device language models to take actions and explain their behaviors in response to unconstrained user commands. We present an end-to-end framework that combines formal modeling, automated training data synthesis, and generative language models to create devices that are both capable and thoughtful in the presence of unconstrained user goals and inquiries. Our framework requires no labeled data and can be deployed on-device, with no cloud dependency. We implement two thoughtful things (a lamp and a thermostat) and deploy them on real hardware, evaluating their practical performance.
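To make the pipeline in the abstract more concrete, the following is a minimal sketch of how a formal device model might be used to synthesize training examples without any labeled data. It is not the authors' implementation: the lamp model, the action names, the hand-written phrasings, and the `synthesize_examples` helper are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, asdict
from itertools import product

# Hypothetical formal model of a lamp (an assumption, not from the paper):
# every state is a (power, brightness) pair drawn from these finite sets.
POWER = ("off", "on")
BRIGHTNESS = (25, 50, 100)


@dataclass(frozen=True)
class LampState:
    power: str
    brightness: int


def turn_on(s: LampState) -> LampState:
    return LampState("on", s.brightness)


def turn_off(s: LampState) -> LampState:
    return LampState("off", s.brightness)


def brighten(s: LampState) -> LampState:
    # Move one brightness level up, saturating at the highest level.
    i = BRIGHTNESS.index(s.brightness)
    return LampState(s.power, BRIGHTNESS[min(i + 1, len(BRIGHTNESS) - 1)])


def dim(s: LampState) -> LampState:
    # Move one brightness level down, saturating at the lowest level.
    i = BRIGHTNESS.index(s.brightness)
    return LampState(s.power, BRIGHTNESS[max(i - 1, 0)])


# Hypothetical action table: name -> state transition function.
ACTIONS = {"turn_on": turn_on, "turn_off": turn_off, "brighten": brighten, "dim": dim}

# Hand-written phrasings standing in for unconstrained user commands.
# A real pipeline would generate these automatically; this is a placeholder.
PHRASINGS = {
    "turn_on": ["it's too dark in here", "lights on please"],
    "turn_off": ["I'm going to bed", "kill the light"],
    "brighten": ["I can't read this", "a bit more light"],
    "dim": ["this is hurting my eyes", "tone it down"],
}


def synthesize_examples() -> list[dict]:
    """Enumerate the model's states and actions to build training examples."""
    examples = []
    for power, brightness in product(POWER, BRIGHTNESS):
        state = LampState(power, brightness)
        for name, action in ACTIONS.items():
            next_state = action(state)
            if next_state == state:
                continue  # skip actions that would not change this state
            for phrase in PHRASINGS[name]:
                examples.append({
                    "command": phrase,
                    "state": asdict(state),
                    "action": name,
                    "next_state": asdict(next_state),
                    "explanation": (
                        f"I ran {name} because the lamp was {state.power} "
                        f"at {state.brightness}% brightness."
                    ),
                })
    return examples


if __name__ == "__main__":
    for example in synthesize_examples()[:3]:
        print(example)
```

Each example pairs a free-form command and the current device state with an action and a templated explanation, which is the kind of data a small on-device language model could be fine-tuned on. The abstract does not specify how the phrasings or explanations are actually generated, so the fixed templates here are only a stand-in.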
URL
https://arxiv.org/abs/2405.03821