Imitation learning from human demonstrations is a powerful framework to teach robots new skills. However, the performance of the learned policies is bottlenecked by the quality, scale, and variety of the demonstration data. In this paper, we aim to lower the barrier to collecting large and high-quality human demonstration data by proposing GELLO, a general framework for building low-cost and intuitive teleoperation systems for robotic manipulation. Given a target robot arm, we build a GELLO controller that has the same kinematic structure as the target arm, leveraging 3D-printed parts and off-the-shelf motors. GELLO is easy to build and intuitive to use. Through an extensive user study, we show that GELLO enables more reliable and efficient demonstration collection compared to commonly used teleoperation devices in the imitation learning literature such as VR controllers and 3D spacemouses. We further demonstrate the capabilities of GELLO for performing complex bi-manual and contact-rich manipulation tasks. To make GELLO accessible to everyone, we have designed and built GELLO systems for 3 commonly used robotic arms: Franka, UR5, and xArm. All software and hardware are open-sourced and can be found on our website: this https URL.
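As a rough sketch of the joint-space mirroring that a kinematically matched controller like GELLO enables (this is not the project's released code; the helper functions and control rate are placeholders):

```python
import time

# Hypothetical I/O helpers -- stand-ins for whatever servo SDK and robot
# interface a GELLO-style build would actually use.
def read_leader_joint_positions() -> list[float]:
    """Read the current joint angles (rad) of the hand-held leader arm."""
    raise NotImplementedError

def send_follower_joint_targets(q: list[float]) -> None:
    """Command the target robot arm to track the given joint angles."""
    raise NotImplementedError

CONTROL_HZ = 100  # assumed streaming rate, not a GELLO specification

def teleop_loop():
    """Mirror the leader arm onto the follower in joint space."""
    period = 1.0 / CONTROL_HZ
    while True:
        q_leader = read_leader_joint_positions()
        # Because the leader shares the follower's kinematic structure,
        # joint angles can be forwarded directly -- no IK is required.
        send_follower_joint_targets(q_leader)
        time.sleep(period)
```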
https://arxiv.org/abs/2309.13037
PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. Since its initial launch in early 2022, PyPose has experienced significant enhancements, incorporating a wide variety of new features into its platform. To satisfy the growing demand for understanding and utilizing the library and to reduce the learning curve for new users, we present the fundamental design principle of the imperative programming interface and showcase the flexible usage of diverse functionalities and modules using an extremely simple Dubins car example. We also demonstrate that PyPose can easily be used to navigate a real quadruped robot with a few lines of code.
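For readers new to the Dubins car mentioned above, its kinematics fit in a few lines; the sketch below uses plain NumPy rather than PyPose's own API, so nothing here reflects the library's actual interface:

```python
import numpy as np

def dubins_step(state, v, omega, dt):
    """One Euler step of the Dubins car: constant forward speed v, bounded turn rate omega.

    state = (x, y, theta); xdot = v*cos(theta), ydot = v*sin(theta), thetadot = omega.
    """
    x, y, theta = state
    return np.array([x + v * np.cos(theta) * dt,
                     y + v * np.sin(theta) * dt,
                     theta + omega * dt])

# Roll out a short left-turning arc.
state = np.zeros(3)
for _ in range(50):
    state = dubins_step(state, v=1.0, omega=0.5, dt=0.1)
print(state)
```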
https://arxiv.org/abs/2309.13035
In recent years, Artificial Neural Networks (ANNs) have become a standard in robotic control. However, a significant drawback of large-scale ANNs is their increased power consumption. This becomes a critical concern when designing autonomous aerial vehicles, given the stringent constraints on power and weight. Especially in the case of blimps, known for their extended endurance, power-efficient control methods are essential. Spiking neural networks (SNNs) can provide a solution, facilitating energy-efficient and asynchronous event-driven processing. In this paper, we have evolved SNNs for accurate altitude control of a non-neutrally buoyant indoor blimp, relying solely on onboard sensing and processing power. The blimp's altitude tracking performance significantly improved compared to prior research, showing reduced oscillations and a minimal steady-state error. The parameters of the SNNs were optimized via an evolutionary algorithm, using a Proportional-Integral-Derivative (PID) controller as the target signal. We developed two complementary SNN controllers while examining various hidden layer structures. The first controller responds swiftly to control errors, mitigating overshooting and oscillations, while the second minimizes steady-state errors due to non-neutral buoyancy-induced drift. Despite the blimp's drivetrain limitations, our SNN controllers ensured stable altitude control, employing only 160 spiking neurons.
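A minimal sketch of how a PID target signal can define the evolutionary fitness for such a spiking controller, assuming a simple tracking-error objective; the gains and the fitness form are illustrative, not the paper's tuned setup:

```python
class PID:
    """Discrete PID controller serving as the target signal.

    Gains below are placeholders, not the values tuned for the blimp.
    """
    def __init__(self, kp=1.0, ki=0.1, kd=0.5, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def __call__(self, error: float) -> float:
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def fitness(snn_outputs, altitude_errors, pid=None):
    """Negative mean squared deviation of the SNN command from the PID target --
    a minimal stand-in for the evolutionary objective described above."""
    pid = pid or PID()
    targets = [pid(e) for e in altitude_errors]
    return -sum((u - t) ** 2 for u, t in zip(snn_outputs, targets)) / len(targets)
```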
https://arxiv.org/abs/2309.12937
This paper addresses the problem of safety-critical control of autonomous robots, considering the ubiquitous uncertainties arising from unmodeled dynamics and noisy sensors. To take into account these uncertainties, probabilistic state estimators are often deployed to obtain a belief over possible states. Namely, Particle Filters (PFs) can handle arbitrary non-Gaussian distributions in the robot's state. In this work, we define the belief state and belief dynamics for continuous-discrete PFs and construct safe sets in the underlying belief space. We design a controller that provably keeps the robot's belief state within this safe set. As a result, we ensure that the risk of the unknown robot's state violating a safety specification, such as avoiding a dangerous area, is bounded. We provide an open-source implementation as a ROS2 package and evaluate the solution in simulations and hardware experiments involving high-dimensional belief spaces.
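One way to read the risk bound in concrete terms: the belief's probability mass inside the unsafe region, estimated from the weighted particles. The snippet below is a simplified illustration of that quantity, not the paper's belief-space controller:

```python
import numpy as np

def belief_risk(particles, weights, unsafe) -> float:
    """Probability mass of the belief that lies in the unsafe region.

    particles: (N, d) array of state hypotheses from the particle filter.
    weights:   (N,) normalized importance weights.
    unsafe:    callable mapping a state to True if it violates the safety spec.
    """
    violating = np.array([unsafe(p) for p in particles])
    return float(np.sum(weights[violating]))

# Example: keep the risk of entering a circular danger zone below 5%.
rng = np.random.default_rng(0)
particles = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(1000, 2))
weights = np.full(1000, 1e-3)
risk = belief_risk(particles, weights, lambda p: np.linalg.norm(p) < 1.0)
print(f"estimated violation risk: {risk:.3f}  (budget: 0.05)")
```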
https://arxiv.org/abs/2309.12857
The robotic handling of compliant and deformable food raw materials, characterized by high biological variation, complex geometrical 3D shapes, and mechanical structures and texture, is currently in huge demand in the ocean space, agricultural, and food industries. Many tasks in these industries are performed manually by human operators who, due to the laborious and tedious nature of their tasks, exhibit high variability in execution, with variable outcomes. The introduction of robotic automation for most complex processing tasks has been challenging due to current robot learning policies. A more consistent learning policy involving skilled operators is desired. In this paper, we address the problem of robot learning when presented with inconsistent demonstrations. To this end, we propose a robust learning policy based on Learning from Demonstration (LfD) for robotic grasping of compliant food objects. The approach fuses RGB-D images and tactile data to estimate the necessary gripper pose, finger configuration, and forces exerted on the object in order to achieve effective robot handling. During LfD training, the gripper pose, finger configurations, and tactile values for the fingers, as well as RGB-D images, are saved. We present an LfD learning policy that automatically removes inconsistent demonstrations and estimates the teacher's intended policy. The performance of our approach is validated and demonstrated for fragile and compliant food objects with complex 3D shapes. The proposed approach has a vast range of potential applications in the aforementioned industry sectors.
https://arxiv.org/abs/2309.12856
We address the challenge of enhancing navigation autonomy for planetary space rovers using reinforcement learning (RL). The ambition of future space missions necessitates advanced autonomous navigation capabilities for rovers to meet mission objectives. RL's potential in robotic autonomy is evident, but its reliance on simulations poses a challenge. Transferring policies to real-world scenarios often encounters the "reality gap", disrupting the transition from virtual to physical environments. The reality gap is exacerbated in the context of mapless navigation on Mars and Moon-like terrains, where unpredictable terrains and environmental factors play a significant role. Effective navigation requires a method attuned to these complexities and real-world data noise. We introduce a novel two-stage RL approach using offline noisy data. Our approach employs a teacher-student policy learning paradigm, inspired by the "learning by cheating" method. The teacher policy is trained in simulation. Subsequently, the student policy is trained on noisy data, aiming to mimic the teacher's behaviors while being more robust to real-world uncertainties. Our policies are transferred to a custom-designed rover for real-world testing. Comparative analyses between the teacher and student policies reveal that our approach offers improved behavioral performance, heightened noise resilience, and more effective sim-to-real transfer.
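The teacher-student stage can be summarized as behavior cloning from noisy observations; the following schematic loss is an assumption about the general setup, not the paper's exact objective:

```python
import numpy as np

def student_bc_loss(student, teacher, clean_obs, noisy_obs):
    """Behavior-cloning objective for the student: match the teacher's action,
    but from the noisy observation the student will see at deployment.

    Both policies are callables obs -> action; this is a schematic stand-in
    for the two-stage training described above, not its actual loss.
    """
    target_actions = np.array([teacher(o) for o in clean_obs])
    student_actions = np.array([student(o) for o in noisy_obs])
    return float(np.mean((student_actions - target_actions) ** 2))
```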
https://arxiv.org/abs/2309.12807
We present CloudGripper, an open source cloud robotics testbed, consisting of a scalable, space- and cost-efficient design constructed as a rack of 32 small robot arm work cells. Each robot work cell is fully enclosed and features individual lighting, a low-cost custom 5-degree-of-freedom Cartesian robot arm with an attached parallel jaw gripper, and a dual camera setup for experimentation. The system design is focused on continuous operation and features 10 Gbit/s network connectivity, allowing for high-throughput remote-controlled experimentation and data collection for robotic manipulation. CloudGripper is furthermore intended to form a community testbed to study the challenges of large-scale machine learning and cloud and edge computing in the context of robotic manipulation. In this work, we describe the mechanical design of the system and its initial software stack, and evaluate the repeatability of motions executed by the proposed robot arm design. A local network API throughput and latency analysis is also provided. CloudGripper-Rope-100, a dataset of more than a hundred hours of randomized rope pushing interactions and approximately 4 million camera images, is collected and serves as a proof of concept demonstrating data collection capabilities. A project website with more information is available at this https URL.
https://arxiv.org/abs/2309.12786
Robot multimodal locomotion encompasses the ability to transition between walking and flying, representing a significant challenge in robotics. This work presents an approach that enables automatic smooth transitions between legged and aerial locomotion. Leveraging the concept of Adversarial Motion Priors, our method allows the robot to imitate motion datasets and accomplish the desired task without the need for complex reward functions. The robot learns walking patterns from human-like gaits and aerial locomotion patterns from motions obtained using trajectory optimization. Through this process, the robot adapts the locomotion scheme based on environmental feedback using reinforcement learning, with the spontaneous emergence of mode-switching behavior. The results highlight the potential for achieving multimodal locomotion in aerial humanoid robotics through automatic control of walking and flying modes, paving the way for applications in diverse domains such as search and rescue, surveillance, and exploration missions. This research contributes to advancing the capabilities of aerial humanoid robots in terms of versatile locomotion in various environments.
https://arxiv.org/abs/2309.12784
The operational environments in which a mobile robot executes its missions often exhibit non-flat terrain characteristics, encompassing outdoor and indoor settings featuring ramps and slopes. In such scenarios, the conventional methodologies employed for localization encounter novel challenges and limitations. This study delineates a localization framework incorporating ground elevation and inclination considerations, deviating from traditional 2D localization paradigms that may falter in such contexts. In our proposed approach, the map encompasses elevation and spatial occupancy information, employing Gridmaps and Octomaps. At the same time, the perception model is designed to accommodate the robot's inclined orientation and the potential presence of ground as an obstacle, besides usual structural and dynamic obstacles. We have developed and rigorously validated our approach within Nav2, an esteemed open-source framework renowned for robot navigation. Our findings demonstrate that our methodology represents a viable and effective alternative for mobile robots operating in challenging outdoor environments or intricate terrains.
https://arxiv.org/abs/2309.12744
The omnidirectional camera is a cost-effective and information-rich sensor highly suitable for many marine applications and the ocean scientific community, encompassing several domains such as augmented reality, mapping, motion estimation, visual surveillance, and simultaneous localization and mapping. However, designing and constructing such a high-quality 360$^{\circ}$ real-time streaming camera system for underwater applications is a challenging problem due to the technical complexity in several aspects, including sensor resolution, wide field of view, power supply, optical design, system calibration, and overheating management. This paper presents a novel and comprehensive system that addresses the complexities associated with the design, construction, and implementation of a fully functional 360$^{\circ}$ real-time streaming camera system specifically tailored for underwater environments. Our proposed system, UWA360CAM, can stream video in real time, operate 24/7, and capture 360$^{\circ}$ underwater panorama images. Notably, our work is the pioneering effort in providing a detailed and replicable account of this system. The experiments provide a comprehensive analysis of our proposed system.
https://arxiv.org/abs/2309.12668
Trajectory tracking control of autonomous trolley collection robots (ATCRs) is a challenging task due to the complex environment, severe noise, and external disturbances. This work investigates a control scheme for an ATCR subject to severe environmental interference. A kinematics-model-based adaptive sliding-mode disturbance observer with fast convergence is first proposed to estimate the lumped disturbances. On this basis, a robust controller with prescribed performance is proposed using a backstepping technique, which improves the transient performance and guarantees fast convergence. Simulation results are provided to illustrate the effectiveness of the proposed control scheme.
https://arxiv.org/abs/2309.12660
Optimization-based safety filters, such as control barrier function (CBF) based quadratic programs (QPs), have demonstrated success in controlling autonomous systems to achieve complex goals. These CBF-QPs can be shown to be continuous, but are generally not smooth, let alone continuously differentiable. In this paper, we present a general characterization of smooth safety filters -- smooth controllers that guarantee safety in a minimally invasive fashion -- based on the Implicit Function Theorem. This characterization leads to families of smooth universal formulas for safety-critical controllers that quantify the conservatism of the resulting safety filter, the utility of which is demonstrated through illustrative examples.
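To make the idea of a smooth universal formula concrete, one common construction (assumed here for a single affine CBF constraint, and not necessarily the paper's exact formula) replaces the ReLU in the explicit CBF-QP solution with a softplus:

```python
import numpy as np

def smooth_safety_filter(u_des, a, b, tau=0.1):
    """Smoothed closed-form CBF filter for a single affine constraint a + b @ u >= 0,
    with a = Lf h(x) + alpha(h(x)) and b = Lg h(x).

    The exact QP solution adds max(0, -(a + b @ u_des)) / ||b||^2 * b to u_des;
    replacing the ReLU with softplus_tau(z) = tau*log(1 + exp(z/tau)) >= max(0, z)
    keeps the constraint satisfied while making the controller smooth in x.
    tau trades smoothness against conservatism (softplus -> ReLU as tau -> 0).
    """
    slack = -(a + b @ u_des)
    softplus = tau * np.logaddexp(0.0, slack / tau)  # numerically stable softplus
    return u_des + softplus / (b @ b) * b
```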
https://arxiv.org/abs/2309.12614
Recent transportation research suggests that autonomous vehicles (AVs) have the potential to improve traffic flow efficiency as they are able to maintain smaller car-following distances. Nevertheless, being a unique class of ground robots, AVs are susceptible to robotic errors, particularly in their perception module, leading to uncertainties in their movements and an increased risk of collisions. Consequently, conservative operational strategies, such as larger headway and slower speeds, are implemented to prioritize safety over traffic capacity in real-world operations. To reconcile the inconsistency, this paper proposes an analytical model framework that delineates the endogenous reciprocity between traffic safety and efficiency that arises from robotic uncertainty in AVs. Car-following scenarios are extensively examined, with uncertain headway as the key parameter for bridging the single-lane capacity and the collision probability. A Markov chain is then introduced to describe the dynamics of the lane capacity, and the resulting expected collision-inclusive capacity is adopted as the ultimate performance measure for fully autonomous traffic. With the help of this analytical model, it is possible to support the settings of critical parameters in AV operations and incorporate optimization techniques to assist traffic management strategies for autonomous traffic.
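The Markov-chain construction can be illustrated with a toy example: capacity regimes as states, a transition matrix between them, and the expected collision-inclusive capacity as the stationary-distribution-weighted average. The numbers below are invented for illustration and are not from the paper:

```python
import numpy as np

# Illustrative 3-state Markov chain over lane-capacity regimes
# (free flow, degraded, post-collision recovery); values are made up.
P = np.array([[0.90, 0.08, 0.02],
              [0.30, 0.60, 0.10],
              [0.20, 0.30, 0.50]])
capacity = np.array([2200.0, 1500.0, 400.0])  # veh/h in each regime

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
pi = pi / pi.sum()

expected_capacity = float(pi @ capacity)
print(f"expected collision-inclusive capacity: {expected_capacity:.0f} veh/h")
```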
https://arxiv.org/abs/2309.12611
Surface electromyography (sEMG) and high-density sEMG (HD-sEMG) biosignals have been extensively investigated for myoelectric control of prosthetic devices, neurorobotics, and more recently human-computer interfaces because of their capability for hand gesture recognition/prediction in a wearable and non-invasive manner. High intraday (same-day) performance has been reported. However, the interday performance (separating training and testing days) is substantially degraded due to the poor generalizability of conventional approaches over time, hindering the application of such techniques in real-life practices. There are limited recent studies on the feasibility of multi-day hand gesture recognition. The existing studies face a major challenge: the need for long sEMG epochs makes the corresponding neural interfaces impractical due to the induced delay in myoelectric control. This paper proposes a compact ViT-based network for multi-day dynamic hand gesture prediction. We tackle the main challenge as the proposed model only relies on very short HD-sEMG signal windows (i.e., 50 ms, accounting for only one-sixth of the convention for real-time myoelectric implementation), boosting agility and responsiveness. Our proposed model can predict 11 dynamic gestures for 20 subjects with an average accuracy of over 71% on the testing day, 3-25 days after training. Moreover, when calibrated on just a small portion of data from the testing day, the proposed model can achieve over 92% accuracy by retraining less than 10% of the parameters for computational efficiency.
https://arxiv.org/abs/2309.12602
In post-disaster scenarios, efficient search and rescue operations involve collaborative efforts between robots and humans. Existing planning approaches focus on specific aspects but overlook crucial elements like information gathering, task assignment, and planning. Furthermore, previous methods that consider robot capabilities and victim requirements suffer from high time complexity due to repetitive planning steps. To overcome these challenges, we introduce a comprehensive framework: the Multi-Stage Multi-Robot Task Assignment. This framework integrates scouting, task assignment, and path-planning stages, optimizing task allocation based on robot capabilities, victim requirements, and past robot performance. Our iterative approach ensures objective fulfillment within problem constraints. Evaluation across four maps, comparing with a state-of-the-art baseline, demonstrates our algorithm's superiority with a remarkable 97 percent performance increase. Our code is open-sourced to enable result replication.
https://arxiv.org/abs/2309.12589
This paper presents a tutorial overview of path integral (PI) control approaches for stochastic optimal control and trajectory optimization. We concisely summarize the theoretical development of path integral control to compute a solution for stochastic optimal control and provide algorithmic descriptions of the cross-entropy (CE) method, an open-loop controller using the receding horizon scheme known as the model predictive path integral (MPPI), and a parameterized state feedback controller based on the path integral control theory. We discuss policy search methods based on path integral control, efficient and stable sampling strategies, extensions to multi-agent decision-making, and MPPI for the trajectory optimization on manifolds. For tutorial demonstrations, some PI-based controllers are implemented in MATLAB and ROS2/Gazebo simulations for trajectory optimization. The simulation frameworks and source codes are publicly available at this https URL.
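For orientation, a bare-bones MPPI update of the kind the tutorial surveys looks as follows (a sketch, not the released MATLAB/ROS2 implementations; the dynamics, cost, and hyperparameters are left to the caller):

```python
import numpy as np

def mppi(x0, dynamics, cost, u_nominal, num_samples=256, sigma=0.5, lam=1.0):
    """One MPPI update: perturb the nominal control sequence, roll out the
    dynamics, and re-weight the perturbations by exponentiated cost.
    """
    horizon, udim = u_nominal.shape
    noise = np.random.randn(num_samples, horizon, udim) * sigma
    costs = np.zeros(num_samples)
    for k in range(num_samples):
        x = x0.copy()
        for t in range(horizon):
            x = dynamics(x, u_nominal[t] + noise[k, t])
            costs[k] += cost(x)
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # Weighted average of the sampled perturbations updates the nominal plan.
    return u_nominal + np.einsum("k,ktu->tu", weights, noise)
```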
https://arxiv.org/abs/2309.12566
In an efficient and flexible human-robot collaborative work environment, a robot team member must be able to recognize both explicit requests and implied actions from human users. Identifying "what to do" in such cases requires an agent to have the ability to construct associations between objects, their actions, and the effect of actions on the environment. In this regard, semantic memory is introduced to understand explicit cues and their relationships with the available objects and the skills required to make "tea" and a "sandwich". We have extended our previous hierarchical robot control architecture to add the capability to execute the most appropriate task based on both feedback from the user and the environmental context. To validate this system, two types of skills were implemented in the hierarchical task tree: 1) tea-making skills and 2) sandwich-making skills. During the conversation between the robot and the human, the robot was able to determine the hidden context using an ontology and began to act accordingly. For instance, if the person says "I am thirsty" or "It is cold outside", the robot will start to perform the tea-making skill. In contrast, if the person says "I am hungry" or "I need something to eat", the robot will make the sandwich. A Baxter humanoid robot was used for this experiment. We tested three scenarios with objects at different positions on the table for each skill. We observed that in all cases, the robot used only objects that were relevant to the skill.
https://arxiv.org/abs/2309.12562
While deep learning enables real robots to perform complex tasks that had been difficult to implement in the past, the challenge is the enormous amount of trial-and-error and motion teaching required in a real environment. The manipulation of moving objects, due to their dynamic properties, requires learning a wide range of factors such as the object's position, movement speed, and grasping timing. We propose a data augmentation method for enabling a robot to grasp moving objects with different speeds and grasping timings at low cost. Specifically, the robot is taught to grasp an object moving at low speed using teleoperation, and multiple data with different speeds and grasping timings are generated by down-sampling and padding the robot sensor data in the time-series direction. By learning from multiple sensor data in a time series, the robot can generate motions while adjusting the grasping timing for unlearned movement speeds and sudden speed changes. We have shown using a real robot that this data augmentation method facilitates learning the relationship between object position and velocity and enables the robot to perform robust grasping motions for unlearned positions and objects with dynamically changing positions and velocities.
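A generic version of the down-sample-and-pad augmentation described above might look like this; the array shapes and speed factors are assumptions for illustration:

```python
import numpy as np

def speed_augment(seq, factor):
    """Simulate a faster object by down-sampling the time axis, then pad back
    to the original length so batch shapes stay fixed.

    seq: (T, D) sensor/command time series recorded at low object speed.
    factor: integer speed-up, e.g. 2 keeps every 2nd step (object appears 2x faster).
    """
    fast = seq[::factor]                                       # compress time
    pad = np.repeat(fast[-1:], len(seq) - len(fast), axis=0)   # hold final state
    return np.concatenate([fast, pad], axis=0)

# From one slow demonstration, build variants at 1x, 2x, 3x apparent speed.
demo = np.random.rand(200, 8)
augmented = [speed_augment(demo, f) for f in (1, 2, 3)]
```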
https://arxiv.org/abs/2309.12547
In human-robot collaboration, there has been a trade-off relationship between the speed of collaborative robots and the safety of human workers. In our previous paper, we introduced a time-optimal path tracking algorithm designed to maximize speed while ensuring safety for human workers. This algorithm runs in real time and provides the safe and fastest control input for every cycle with respect to ISO standards. However, true optimality has not been achieved due to inaccurate distance computation resulting from conservative model simplification. To attain true optimality, we require a method that can compute distances (1) at many robot configurations to be examined along a trajectory, (2) in real time for online robot control, and (3) as precisely as possible for optimal control. In this paper, we propose a batched, fast, and precise distance checking method based on precomputed link-local SDFs. Our method can check distances for 500 waypoints along a trajectory within less than 1 millisecond using a GPU at runtime, making it suitable for time-critical robotic control. Additionally, a neural approximation has been proposed to accelerate preprocessing by a factor of 2. Finally, we experimentally demonstrate that our method can navigate a 6-DoF robot earlier than a geometric-primitives-based distance checker in a dynamic and collaborative environment.
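Schematically, the batched check amounts to transforming query points into each link's frame and reading a precomputed link-local SDF grid; the CPU sketch below illustrates the lookup only and is not the paper's GPU implementation:

```python
import numpy as np

def batched_min_distance(obstacle_pts, link_poses, link_sdfs, voxel_size, origins):
    """Nearest-voxel lookup into precomputed link-local SDF grids.

    obstacle_pts: (P, 3) points in the world frame.
    link_poses:   (W, L, 4, 4) link poses for W waypoints and L links (from FK).
    link_sdfs:    list of L 3-D arrays of signed distance sampled on local grids.
    voxel_size, origins: grid resolution and per-link grid origin (local frame).
    Returns the (W,) minimum robot-to-obstacle distance per waypoint.
    """
    W, L = link_poses.shape[:2]
    dmin = np.full(W, np.inf)
    homog = np.concatenate([obstacle_pts, np.ones((len(obstacle_pts), 1))], axis=1)
    for w in range(W):
        for l in range(L):
            local = (np.linalg.inv(link_poses[w, l]) @ homog.T).T[:, :3]
            idx = np.clip(((local - origins[l]) / voxel_size).astype(int),
                          0, np.array(link_sdfs[l].shape) - 1)
            d = link_sdfs[l][idx[:, 0], idx[:, 1], idx[:, 2]]
            dmin[w] = min(dmin[w], d.min())
    return dmin
```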
https://arxiv.org/abs/2309.12543
We propose a risk-aware crash mitigation system (RCMS), to augment any existing motion planner (MP), that enables an autonomous vehicle to perform evasive maneuvers in high-risk situations and minimize the severity of collision if a crash is inevitable. In order to facilitate a smooth transition between RCMS and MP, we develop a novel activation mechanism that combines instantaneous as well as predictive collision risk evaluation strategies in a unified hysteresis-band approach. For trajectory planning, we deploy a modular receding horizon optimization-based approach that minimizes a smooth situational risk profile, while adhering to the physical road limits as well as vehicular actuator limits. We demonstrate the performance of our approach in a simulation environment.
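The hysteresis-band activation can be sketched as a two-threshold switch on the combined risk signal; the thresholds below are placeholders, not the paper's calibrated values:

```python
class HysteresisActivation:
    """Switch between the nominal motion planner (MP) and the mitigation system
    (RCMS) using two risk thresholds, so the hand-off does not chatter near a
    single boundary. The risk signal fed in would combine instantaneous and
    predictive collision risk, as described above."""
    def __init__(self, on_threshold=0.7, off_threshold=0.4):
        self.on, self.off = on_threshold, off_threshold
        self.rcms_active = False

    def update(self, risk: float) -> bool:
        if not self.rcms_active and risk >= self.on:
            self.rcms_active = True       # engage evasive/mitigation mode
        elif self.rcms_active and risk <= self.off:
            self.rcms_active = False      # return control to the nominal MP
        return self.rcms_active
```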
https://arxiv.org/abs/2309.12531