Abstract
Controllable scene generation could substantially reduce the cost of collecting diverse data for autonomous driving. Prior works formulate traffic layout generation as a predictive process, either by denoising entire sequences at once or by iteratively predicting the next frame. However, full-sequence denoising hinders online reaction, while short-sighted next-frame prediction lacks precise goal-state guidance. Further, learned models struggle to generate complex or challenging scenarios because open datasets are dominated by safe, ordinary driving behaviors. To overcome these limitations, we introduce Nexus, a decoupled scene generation framework that improves reactivity and goal conditioning by simulating both ordinary and challenging scenarios from fine-grained tokens with independent noise states. At the core of the decoupled pipeline is the integration of a partial noise-masking training strategy and a noise-aware schedule that ensures timely environmental updates throughout the denoising process. To complement challenging scenario generation, we collect a dataset of complex corner cases. It covers 540 hours of simulated data, including high-risk interactions such as cut-ins, sudden braking, and collisions. Nexus achieves superior generation realism while preserving reactivity and goal orientation, with a 40% reduction in displacement error. We further demonstrate that Nexus improves closed-loop planning by 20% through data augmentation and showcase its capability in safety-critical data generation.
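As a rough illustration of what "fine-grained tokens with independent noise states" and "partial noise-masking" might look like, the sketch below gives every (agent, timestep) token its own noise level and pins selected tokens to zero noise. The abstract does not specify any of these details, so everything here is an assumption: the function names `sample_noise_levels` and `add_noise`, the linear blending rule, the scalar toy tokens, and the choice to keep the first and last timestep clean are hypothetical and not taken from the Nexus implementation.

```python
import random


def sample_noise_levels(num_agents, num_steps, clean_mask, rng):
    """Draw an independent noise level in [0, 1) for every (agent, step) token.

    Tokens flagged in clean_mask are pinned to 0.0, i.e. kept noise-free.
    (Hypothetical helper; not the paper's actual sampling rule.)
    """
    return [
        [0.0 if clean_mask[a][t] else rng.random() for t in range(num_steps)]
        for a in range(num_agents)
    ]


def add_noise(tokens, levels, rng):
    """Blend each scalar token with Gaussian noise according to its own level.

    Assumed blending rule: noisy = (1 - k) * x + k * eps, with eps ~ N(0, 1).
    """
    return [
        [(1.0 - k) * x + k * rng.gauss(0.0, 1.0) for x, k in zip(row, lv)]
        for row, lv in zip(tokens, levels)
    ]


rng = random.Random(0)

# Toy scene: 2 agents, 4 timesteps, one scalar position per token.
tokens = [[0.0, 1.0, 2.0, 3.0], [5.0, 5.5, 6.0, 6.5]]

# Assumed masking choice: keep the first step (observed state) and the
# last step (goal state) clean; all other tokens receive random noise.
clean_mask = [[t == 0 or t == 3 for t in range(4)] for _ in tokens]

levels = sample_noise_levels(2, 4, clean_mask, rng)
noisy = add_noise(tokens, levels, rng)
```

In this toy setup the masked tokens pass through unchanged, which is one simple way a clean goal state or a freshly observed state could remain visible to a denoiser while the remaining tokens sit at different noise levels.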
Abstract (translated)
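Controllable scene generation can greatly reduce the cost of collecting diverse data for autonomous driving. Previous studies treat traffic layout generation as a predictive process, either removing noise from the entire sequence at once or iteratively predicting the next frame. However, denoising the full sequence in one pass hinders online reaction, while short-horizon prediction based only on the next frame lacks precise goal-state guidance. In addition, because open datasets contain a large number of safe and routine driving behaviors, learned models struggle to generate complex or challenging scenarios. To overcome these problems, we introduce Nexus, a decoupled scene generation framework that improves reactivity and goal orientation by simulating fine-grained tokens with independent noise states, and that can generate both ordinary and challenging scenarios. At the core of the framework is the integration of a partial noise-masking training strategy and a noise-aware schedule, which ensures timely environment updates throughout the denoising process. To complement challenging scenario generation, we collect a dataset of complex corner cases comprising 540 hours of simulated data, including high-risk interactions such as cut-ins, sudden braking, and collisions. Nexus achieves more realistic scene generation while preserving reactivity and goal orientation, and reduces displacement error by 40%. We also show that data augmentation improves closed-loop planning by 20%, and we demonstrate the framework's capability in safety-critical data generation.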
URL
https://arxiv.org/abs/2504.10485