Paper Reading AI Learner

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios

2024-04-30 17:15:42
Jingbo Wang, Zhengyi Luo, Ye Yuan, Yixuan Li, Bo Dai

Abstract

We address the challenge of content diversity and controllability in pedestrian simulation for driving scenarios. Recent pedestrian animation frameworks have a significant limitation: they primarily focus on either trajectory following [46] or replicating the content of a reference video [57], consequently overlooking the potential diversity of human motion within such scenarios. This limitation restricts the ability to generate pedestrian behaviors with a wider range of variation and realism, and therefore restricts their use in providing rich motion content for other components of a driving simulation system, e.g., a sudden change of motion to which the autonomous vehicle should respond. In our approach, we strive to surpass this limitation by supporting diverse human motions obtained from various sources, such as generated human motions, in addition to following a given trajectory. The fundamental contribution of our framework lies in combining the motion tracking task with trajectory following, which enables a single policy to track specific body parts (e.g., the upper body) while simultaneously following the given trajectory. In this way, we significantly enhance both the diversity of simulated human motion within a given scenario and the controllability of the content, including language-based control. Our framework facilitates the generation of a wide range of human motions, contributing to greater realism and adaptability in pedestrian simulations for driving scenarios. More information is on our project page this https URL .
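The abstract's core idea, a single policy rewarded for both following a trajectory and tracking a reference motion on selected body parts, can be sketched as a combined reward. The following is a minimal illustration, not the paper's actual formulation: the weights, error scales, and the per-joint mask (e.g., upper body only) are all hypothetical.

```python
import math

def combined_reward(root_pos, target_traj_pos,
                    joint_rot, ref_joint_rot, track_mask,
                    w_traj=0.5, w_track=0.5):
    """Hypothetical combined objective: trajectory following on the root
    plus motion tracking restricted to a masked joint subset."""
    # Trajectory term: exponentiated negative squared root-position error.
    traj_err = sum((a - b) ** 2 for a, b in zip(root_pos, target_traj_pos))
    r_traj = math.exp(-2.0 * traj_err)

    # Tracking term: rotation error computed only on masked joints
    # (e.g., upper-body joints), leaving the rest free to follow the path.
    errs = [(q - r) ** 2
            for q, r, m in zip(joint_rot, ref_joint_rot, track_mask) if m]
    track_err = sum(errs) / max(len(errs), 1)
    r_track = math.exp(-5.0 * track_err)

    return w_traj * r_traj + w_track * r_track

# Example: perfect root tracking, upper-body joints (mask=1) match the
# reference exactly, lower-body joints (mask=0) are ignored.
r = combined_reward(root_pos=[0.0, 0.0], target_traj_pos=[0.0, 0.0],
                    joint_rot=[0.0, 0.0, 0.7, 0.3],
                    ref_joint_rot=[0.0, 0.0, 0.0, 0.0],
                    track_mask=[1, 1, 0, 0])
```

Masking the tracking term per joint is what lets one policy serve both tasks: with an all-zero mask it reduces to pure trajectory following, while a full mask recovers full-body imitation.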


URL

https://arxiv.org/abs/2404.19722

PDF

https://arxiv.org/pdf/2404.19722.pdf

