WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

Abstract

In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (also known as diffusion models) and transformers. Our proposed framework, termed the "World-Centric Diffusion Transformer" (WcDT), optimizes the entire trajectory generation process, from feature extraction to model inference. To enhance scene diversity and stochasticity, the historical trajectory data is first preprocessed and encoded into latent space using Denoising Diffusion Probabilistic Models (DDPM) enhanced with Diffusion with Transformer (DiT) blocks. Then, the latent features, historical trajectories, HD map features, and historical traffic signal information are fused by several transformer-based encoders. The encoded traffic scenes are then decoded by a trajectory decoder to generate multimodal future trajectories. Comprehensive experimental results show that the proposed approach exhibits superior performance in generating both realistic and diverse trajectories, indicating its potential for integration into autonomous driving simulation systems.
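
To make the DDPM step mentioned in the abstract more concrete, the following is a minimal sketch of the standard DDPM forward (noising) process applied to a short history of 2D waypoints. It is not the authors' implementation: the linear noise schedule, the function names (`linear_beta_schedule`, `cumulative_alphas`, `noise_trajectory`), the waypoint format, and all numeric values are assumptions chosen for illustration, and the DiT blocks, encoders, and trajectory decoder are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_trajectory(trajectory, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    trajectory: list of (x, y) waypoints (hypothetical input format).
    t: zero-based diffusion step index.
    Returns the noised waypoints and the Gaussian noise that was mixed in.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in trajectory]
    noised = [
        (signal_scale * x + noise_scale * ex, signal_scale * y + noise_scale * ey)
        for (x, y), (ex, ey) in zip(trajectory, noise)
    ]
    return noised, noise


# Illustrative values only: 1000 diffusion steps and a three-point history.
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
history = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.3)]
noised, noise = noise_trajectory(history, t=999, alpha_bars=alpha_bars, rng=random.Random(0))
```

In a DDPM-style model, a denoising network is trained to predict `noise` from `noised` and `t`; sampling then runs the process in reverse from pure Gaussian noise, which is one source of the stochasticity and diversity the abstract refers to.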

URL

https://arxiv.org/abs/2404.02082

PDF

https://arxiv.org/pdf/2404.02082.pdf
