Safe POMDP Online Planning among Dynamic Agents via Adaptive Conformal Prediction

2024-04-23 23:11:42
Shili Sheng, Pian Yu, David Parker, Marta Kwiatkowska, Lu Feng

Abstract

Online planning for partially observable Markov decision processes (POMDPs) provides efficient techniques for robot decision-making under uncertainty. However, existing methods fall short of preventing safety violations in dynamic environments. This work presents a novel safe POMDP online planning approach that offers probabilistic safety guarantees amidst environments populated by multiple dynamic agents. Our approach utilizes data-driven trajectory prediction models of dynamic agents and applies Adaptive Conformal Prediction (ACP) for assessing the uncertainties in these predictions. Leveraging the obtained ACP-based trajectory predictions, our approach constructs safety shields on-the-fly to prevent unsafe actions within POMDP online planning. Through experimental evaluation in various dynamic environments using real-world pedestrian trajectory data, the proposed approach has been shown to effectively maintain probabilistic safety guarantees while accommodating up to hundreds of dynamic agents.
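The abstract names two mechanisms: an adaptive conformal update that sizes the uncertainty region around each predicted agent position, and a shield that filters out unsafe actions. The sketch below illustrates how the two could fit together; it is not the authors' implementation. It assumes the generic adaptive conformal update rule (the miscoverage level is nudged by `gamma * (target_alpha - err)` after each observation), a nonconformity score equal to the Euclidean prediction error, and a one-step shield. All function names, parameters, and the `safe_dist` margin are hypothetical.

```python
import math


def quantile(scores, level):
    """Empirical quantile of past nonconformity scores.

    Returns infinity when no finite bound is justified (no data, or level >= 1)
    and 0.0 when the requested level is non-positive.
    """
    if not scores or level >= 1.0:
        return math.inf
    if level <= 0.0:
        return 0.0
    ordered = sorted(scores)
    k = math.ceil(level * len(ordered)) - 1
    return ordered[max(k, 0)]


def acp_step(alpha_t, past_scores, new_score, target_alpha=0.1, gamma=0.05):
    """One adaptive conformal update (illustrative parameter values).

    radius: prediction-region radius used at this step.
    alpha_next: adapted miscoverage level for the next step; it shrinks after
    a miss (wider regions next time) and grows after a hit.
    """
    radius = quantile(past_scores, 1.0 - alpha_t)
    err = 1.0 if new_score > radius else 0.0
    alpha_next = alpha_t + gamma * (target_alpha - err)
    return radius, alpha_next


def shield(actions, robot_xy, predicted_agents, radius, safe_dist=0.5):
    """Keep only actions whose next position stays clear of every
    predicted agent position inflated by the conformal radius."""
    safe = []
    for name, (dx, dy) in actions.items():
        nx, ny = robot_xy[0] + dx, robot_xy[1] + dy
        if all(math.hypot(nx - ax, ny - ay) > radius + safe_dist
               for ax, ay in predicted_agents):
            safe.append(name)
    return safe
```

In a planner loop, `acp_step` would run once per agent per time step as new observations arrive, and `shield` would prune the action set before the POMDP search expands it. The actual approach builds shields over multi-step predictions and belief states, which this one-step sketch omits.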

URL

https://arxiv.org/abs/2404.15557

PDF

https://arxiv.org/pdf/2404.15557.pdf

