Paper Reading AI Learner

Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour

2025-07-17 16:38:14
Emma M. A. Harrison

Abstract

Robots are increasingly integrated across industries, particularly in healthcare. However, many valuable applications for quadrupedal robots remain overlooked. This research explores the effectiveness of three reinforcement learning algorithms in training a simulated quadruped robot for autonomous navigation and obstacle avoidance. The goal is to develop a robotic guide dog simulation capable of path following and obstacle avoidance, with long-term potential for real-world assistance to guide dogs and visually impaired individuals. It also seeks to expand research into medical 'pets', including robotic guide and alert dogs. A comparative analysis of thirteen related research papers shaped key evaluation criteria, including collision detection, pathfinding algorithms, sensor usage, robot type, and simulation platforms. The study focuses on sensor inputs, collision frequency, reward signals, and learning progression to determine which algorithm best supports robotic navigation in complex environments. Custom-made environments were used to ensure fair evaluation of all three algorithms under controlled conditions, allowing consistent data collection. Results show that Proximal Policy Optimization (PPO) outperformed Deep Q-Network (DQN) and Q-learning across all metrics, particularly in average and median steps to goal per episode. By analysing these results, this study contributes to robotic navigation, AI and medical robotics, offering insights into the feasibility of AI-driven quadruped mobility and its role in assistive robotics.
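
For readers unfamiliar with the simplest of the three algorithms, the following is a minimal sketch of a tabular Q-learning update, together with a helper that computes the two headline metrics named in the abstract (average and median steps to goal per episode). It is not the paper's implementation: the state labels, the action set, the reward values, and the hyperparameters alpha and gamma are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the paper.
    """
    # Greedy estimate of the best value reachable from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def summarize_steps(steps_per_episode):
    """Return (mean, median) steps to goal across a list of episodes."""
    return mean(steps_per_episode), median(steps_per_episode)


# Hypothetical discrete action set for a navigating quadruped.
ACTIONS = ("forward", "left", "right")

# Unseen (state, action) pairs default to a value of 0.0.
q_table = defaultdict(float)
```

Calling q_learning_update(q_table, "s0", "forward", 1.0, "s1", ACTIONS) on the empty table moves the value for ("s0", "forward") halfway towards the reward. The median is reported alongside the mean because a few very long episodes can inflate the mean without changing typical behaviour.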

URL

https://arxiv.org/abs/2507.13277

PDF

https://arxiv.org/pdf/2507.13277.pdf