Abstract
Accurate pedestrian trajectory prediction is crucial for improving the safety of autonomous vehicles and reducing traffic fatalities involving pedestrians. While numerous studies have modeled interactions among pedestrians to forecast their movements, the influence of environmental factors and the placement of scene objects remains comparatively underexplored. In this paper, we present a trajectory prediction model that integrates both pedestrian interactions and environmental context to improve prediction accuracy. Our approach captures spatial and temporal interactions among pedestrians within a sparse graph framework. To account for pedestrian-scene interactions, we apply image enhancement and semantic segmentation techniques to extract detailed scene features. The scene and interaction features are then fused through a cross-attention mechanism, which lets the model prioritize the environmental factors most relevant to pedestrian movement. Finally, a temporal convolutional network processes the fused features to predict future pedestrian trajectories. Experimental results demonstrate that our method significantly outperforms existing state-of-the-art approaches, achieving an average displacement error (ADE) of 0.252 meters and a final displacement error (FDE) of 0.372 meters, which underscores the importance of incorporating both social interactions and environmental context in pedestrian trajectory prediction.
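For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how ADE and FDE are conventionally computed for a single predicted trajectory. The function name `displacement_errors` and the sample coordinates are illustrative assumptions, not taken from the paper, and the abstract does not state the exact evaluation protocol (for example, whether the best of several sampled predictions is scored, or how errors are averaged across pedestrians).

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) in meters for one pedestrian's trajectory.

    Both arguments are equal-length sequences of (x, y) positions in meters,
    one entry per predicted time step. Hypothetical helper for illustration.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")

    # Euclidean distance between prediction and ground truth at each step.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]

    ade = sum(distances) / len(distances)  # mean error over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde


# Made-up example data: a straight-line prediction versus a path that drifts.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.4)]

ade, fde = displacement_errors(predicted, ground_truth)
print(f"ADE = {ade:.3f} m, FDE = {fde:.3f} m")
```

Because FDE looks only at the last step, it is typically larger than ADE when errors accumulate over the prediction horizon, which is consistent with the reported 0.372 m versus 0.252 m.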
URL
https://arxiv.org/abs/2501.13848