Generating Probabilistic Scenario Programs from Natural Language

Abstract

For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur during rare events. To replicate rare events such as vehicle crashes, many companies have built logging systems and employed crash reconstruction experts to meticulously recreate these valuable events in simulation. With these methods, however, "what if" questions are not easily formulated and answered. We present ScenarioNL, an AI system for creating scenario programs from natural language. Specifically, we generate these programs from police crash reports. Reports usually contain uncertainty about the exact details of an incident, which we represent in Scenic, a probabilistic programming language (PPL). Scenic lets us clearly and concisely represent uncertainty and variation over CPS behaviors, properties, and interactions. We demonstrate that commonplace prompting techniques with the best large language models (LLMs) are incapable of reasoning about probabilistic scenario programs and of generating code for low-resource languages such as Scenic. Our system consists of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both semantically meaningful and syntactically correct.
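
The abstract describes LLMs chained with a compiler so that generated code ends up syntactically correct. One way such a loop could look is sketched below in Python. This is only an illustration under stated assumptions, not the ScenarioNL implementation: `generate_with_repair`, `CompileResult`, `toy_generate`, and `toy_compile` are hypothetical names, the generator stands in for an LLM call, and the checker stands in for a real compiler (it is not the Scenic compiler and does not parse Scenic).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CompileResult:
    """Outcome of a (hypothetical) compile check."""
    ok: bool
    error: Optional[str] = None


def generate_with_repair(
    report: str,
    generate: Callable[[str, Optional[str]], str],
    compile_check: Callable[[str], CompileResult],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Generate a program, feeding compiler errors back until it compiles.

    Returns (program, attempts_used), or (None, max_attempts) on failure.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        program = generate(report, feedback)
        result = compile_check(program)
        if result.ok:
            return program, attempt
        # Pass the error message to the next generation attempt.
        feedback = result.error
    return None, max_attempts


def toy_generate(report: str, feedback: Optional[str]) -> str:
    """Stand-in for an LLM: produces a flawed draft, then a fixed one."""
    if feedback is None:
        return "car = vehicle"   # first draft: no 'ego' object
    return "ego = vehicle"       # revised draft after seeing the error


def toy_compile(program: str) -> CompileResult:
    """Stand-in for a compiler: only checks that an 'ego' object is defined."""
    if any(line.strip().startswith("ego =") for line in program.splitlines()):
        return CompileResult(ok=True)
    return CompileResult(ok=False, error="no ego object defined")


if __name__ == "__main__":
    program, attempts = generate_with_repair(
        "Vehicle 1 rear-ended Vehicle 2 at an intersection.",
        toy_generate,
        toy_compile,
    )
    print(attempts, program)
```

In a real pipeline the two callables would wrap an LLM client and the actual language toolchain; keeping them as parameters makes the retry logic testable without either.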

URL

https://arxiv.org/abs/2405.03709

PDF

https://arxiv.org/pdf/2405.03709.pdf
