Although reinforcement learning has seen tremendous success recently, this kind of trial-and-error learning can be impractical or inefficient in complex environments. The use of demonstrations, on the other hand, enables agents to benefit from expert knowledge rather than having to discover the best action to take through exploration. In this survey, we discuss the advantages of using demonstrations in sequential decision making, various ways to apply demonstrations in learning-based decision making paradigms (for example, reinforcement learning and planning in the learned models), and how to collect the demonstrations in various scenarios. Additionally, we exemplify a practical pipeline for generating and utilizing demonstrations in the recently proposed ManiSkill robot learning benchmark.
https://arxiv.org/abs/2303.13489
Recent advances in learning-based approaches have led to impressive dexterous manipulation capabilities. Yet, we haven't witnessed widespread adoption of these capabilities beyond the laboratory. This is likely due to practical limitations, such as significant computational burden, inscrutable policy architectures, sensitivity to parameter initializations, and the considerable technical expertise required for implementation. In this work, we investigate the utility of Koopman operator theory in alleviating these limitations. Koopman operators are simple yet powerful control-theoretic structures that help represent complex nonlinear dynamics as linear systems in higher-dimensional spaces. Motivated by the fact that complex nonlinear dynamics underlie dexterous manipulation, we develop an imitation learning framework that leverages Koopman operators to simultaneously learn the desired behavior of both robot and object states. We demonstrate that a Koopman operator-based framework is surprisingly effective for dexterous manipulation and offers a number of unique benefits. First, the learning process is analytical, eliminating the sensitivity to parameter initializations and painstaking hyperparameter optimization. Second, the learned reference dynamics can be combined with a task-agnostic tracking controller such that task changes and variations can be handled with ease. Third, a Koopman operator-based approach can perform comparably to state-of-the-art imitation learning algorithms in terms of task success rate and imitation error, while being an order of magnitude more computationally efficient. In addition, we discuss a number of avenues for future research made available by this work.
https://arxiv.org/abs/2303.13446
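The Koopman abstract above describes the learning step as analytical. A minimal sketch of what that can look like, assuming a least-squares fit of a linear operator on lifted states: the lifting function (state plus element-wise squares), the function names, and the toy demonstration are all hypothetical illustrations, not the authors' framework.

```python
import numpy as np

def lift(x):
    """Hypothetical lifting: the state followed by its element-wise squares."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, x ** 2])

def fit_koopman(states):
    """Closed-form least-squares fit of K so that lift(x[t+1]) ~ K @ lift(x[t])."""
    Z = np.array([lift(x) for x in states])
    past, future = Z[:-1], Z[1:]
    # Solve past @ K.T ~ future in the least-squares sense (no iterative training).
    K_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return K_T.T

def rollout(K, x0, steps, state_dim):
    """Advance the lifted state linearly and read back the original state."""
    z = lift(x0)
    out = [z[:state_dim]]
    for _ in range(steps):
        z = K @ z
        out.append(z[:state_dim])
    return np.array(out)

# Toy demonstration (an assumption): scalar decay x[t+1] = 0.9 * x[t].
demo = [np.array([0.9 ** t]) for t in range(20)]
K = fit_koopman(demo)
pred = rollout(K, demo[0], steps=5, state_dim=1)
print(pred[:, 0])
```

Because the fit is a single linear solve, there are no initial weights or learning rates to tune, which is the property the abstract highlights.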
As robots become more prevalent, optimizing their design for better performance and efficiency is becoming increasingly important. However, current robot design practices overlook the impact of perception and design choices on a robot's learning capabilities. To address this gap, we propose a comprehensive methodology that accounts for the interplay between the robot's perception, hardware characteristics, and task requirements. Our approach optimizes the robot's morphology holistically, leading to improved learning and task execution proficiency. To achieve this, we introduce a Morphology-AGnostIc Controller (MAGIC), which helps with the rapid assessment of different robot designs. The MAGIC policy is efficiently trained through a novel PRIvileged Single-stage learning via latent alignMent (PRISM) framework, which also encourages behaviors that are typical of robot onboard observation. Our simulation-based results demonstrate that holistically optimized morphologies improve robot performance by 15-20% on various manipulation tasks and require 25x less data to match the performance of morphologies designed by human experts. In summary, our work contributes to the growing trend of learning-based approaches in robotics and emphasizes the potential in designing robots that facilitate better learning.
https://arxiv.org/abs/2303.13390
We are interested in pick-and-place style robot manipulation tasks in cluttered and confined 3D workspaces among movable objects that may be rearranged by the robot and may slide, tilt, lean or topple. A recently proposed algorithm, M4M, determines which objects need to be moved and where by solving a Multi-Agent Pathfinding (MAPF) abstraction of this problem. It then utilises a nonprehensile push planner to compute actions for how the robot might realise these rearrangements and a rigid body physics simulator to check whether the actions satisfy physics constraints encoded in the problem. However, M4M greedily commits to valid pushes found during planning, and does not reason about orderings over pushes if multiple objects need to be rearranged. Furthermore, M4M does not reason about other possible MAPF solutions that lead to different rearrangements and pushes. In this paper, we extend M4M and present Enhanced-M4M (E-M4M), a systematic graph search-based solver that searches over orderings of pushes for movable objects that need to be rearranged and different possible rearrangements of the scene. We introduce several algorithmic optimisations to circumvent the increased computational complexity, discuss the space of problems solvable by E-M4M and show experimentally, both on a real robot and in simulation, that it significantly outperforms the original M4M algorithm, as well as other state-of-the-art alternatives, when dealing with complex scenes.
https://arxiv.org/abs/2303.13385
Planning and control for uncertain contact systems is challenging because it is not clear how to propagate uncertainty for planning. Contact-rich tasks can be modeled efficiently using complementarity constraints, among other techniques. In this paper, we present a stochastic optimization technique with chance constraints for systems with stochastic complementarity constraints. We use a particle filter-based approach to propagate moments for the stochastic complementarity system. To circumvent the issues of open-loop chance-constrained planning, we propose a contact-aware controller for covariance steering of the complementarity system. Our optimization problem is formulated as a nonlinear program (NLP) using bilevel optimization. We present an important-particle algorithm to improve the numerical efficiency of the underlying control problem. We verify that our contact-aware closed-loop controller is able to steer the covariance of the states under stochastic contact-rich tasks.
https://arxiv.org/abs/2303.13382
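The contact-planning abstract above relies on propagating moments with particles and on chance constraints. The sketch below illustrates only that general idea with plain Monte Carlo sampling, assuming a one-dimensional block that slides toward a wall whose position is uncertain. The dynamics, noise levels, and the limit 1.1 are hypothetical values, and the non-penetration rule `min(p + v*dt, wall)` stands in for a full complementarity model.

```python
import numpy as np

def propagate(p, wall, v, dt, steps):
    """Push every particle forward; contact stops it at its own wall sample."""
    for _ in range(steps):
        # Non-penetration: the gap (wall - p) can never become negative.
        p = np.minimum(p + v * dt, wall)
    return p

rng = np.random.default_rng(0)
n = 2000
p0 = rng.normal(0.0, 0.02, n)    # uncertain initial position (assumed)
wall = rng.normal(1.0, 0.05, n)  # uncertain contact location (assumed)

pT = propagate(p0, wall, v=0.5, dt=0.1, steps=30)

# Moments estimated from the particle set.
mean, var = pT.mean(), pT.var()
# Empirical chance constraint: fraction of particles with p <= 1.1.
prob_ok = np.mean(pT <= 1.1)
print(mean, var, prob_ok)
```

A planner could then require `prob_ok` to exceed a chosen confidence level, which is the role a chance constraint plays in the optimization.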
Within academia and industry, there has been a need for expansive simulation frameworks that include model-based simulation of sensors, mobile vehicles, and the environment around them. To this end, the modular, real-time, and open-source AirSim framework has been a popular community-built system that fulfills some of those needs. However, the framework required additional systems to serve some complex industrial applications, including designing and testing new sensor modalities, Simultaneous Localization And Mapping (SLAM), autonomous navigation algorithms, and transfer learning with machine learning models. In this work, we discuss the modifications and additions to our open-source version of the AirSim simulation framework, including new sensor modalities, vehicle types, and methods to procedurally generate realistic environments with changeable objects. Furthermore, we show the various applications and use cases the framework can serve.
https://arxiv.org/abs/2303.13381
Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some 'movable' objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions where multiple objects might be moved by the robot simultaneously, and objects might tilt, lean on each other, or topple. To support this, we query a physics-based simulator to forward simulate these interaction dynamics, which makes action evaluation during planning computationally very expensive. To make the planner tractable, we establish a connection between the domain of Manipulation Among Movable Objects and Multi-Agent Pathfinding that lets us decompose the problem into two phases that our M4M algorithm iterates over. First, we solve a multi-agent planning problem that reasons about the configurations of movable objects but does not forward simulate a physics model. Next, an arm motion planning problem is solved that uses a physics-based simulator but does not search over possible configurations of movable objects. We run simulated and real-world experiments with the PR2 robot and compare against relevant baseline algorithms. Our results highlight that M4M generates complex 3D interactions, and solves at least twice as many problems as the baselines with competitive performance.
https://arxiv.org/abs/2303.13352
Pre-defined manipulation primitives are widely used for cloth manipulation. However, cloth properties such as stiffness or density can strongly affect the performance of these primitives. Although existing solutions have tackled the parameterisation of pick and place locations, the effect of factors such as the velocity or trajectory of quasi-static and dynamic manipulation primitives has been neglected. Choosing appropriate values for these parameters is crucial to cope with the range of materials present in household cloth objects. To address this challenge, we introduce the Quasi-Dynamic Parameterisable (QDP) method, which optimises parameters such as the motion velocity in addition to the pick and place positions of quasi-static and dynamic manipulation primitives. In this work, we leverage the framework of Sequential Reinforcement Learning to sequentially decouple the parameters that compose the primitives. To evaluate the effectiveness of the method, we focus on the task of cloth unfolding with a robotic arm in simulation and real-world experiments. Our results in simulation show that by deciding the optimal parameters for the primitives the performance can improve by 20% compared to sub-optimal ones. Real-world results demonstrate the advantage of modifying the velocity and height of manipulation primitives for cloths with different mass, stiffness, shape and size. Supplementary material, videos, and code can be found at this https URL.
https://arxiv.org/abs/2303.13320
This work presents a novel RGB-D-inertial dynamic SLAM method that can enable accurate localisation when the majority of the camera view is occluded by multiple dynamic objects over a long period of time. Most dynamic SLAM approaches either remove dynamic objects as outliers when they account for a minor proportion of the visual input, or detect dynamic objects using semantic segmentation before camera tracking. Therefore, dynamic objects that cause large occlusions are difficult to detect without prior information. The remaining visual information from the static background is also not enough to support localisation when a large occlusion lasts for a long period. To overcome these problems, our framework presents a robust visual-inertial bundle adjustment that simultaneously tracks the camera, estimates cluster-wise dense segmentation of dynamic objects and maintains a static sparse map by combining dense and sparse features. The experimental results demonstrate that our method achieves promising localisation and object segmentation performance compared to other state-of-the-art methods in the scenario of long-term large occlusion.
https://arxiv.org/abs/2303.13316
We present a novel technique to estimate the 6D pose of objects from single images where the 3D geometry of the object is only given approximately and not as a precise 3D model. To achieve this, we employ a dense 2D-to-3D correspondence predictor that regresses 3D model coordinates for every pixel. In addition to the 3D coordinates, our model also estimates the pixel-wise coordinate error to discard correspondences that are likely wrong. This allows us to generate multiple 6D pose hypotheses of the object, which we then refine iteratively using a highly efficient region-based approach. We also introduce a novel pixel-wise posterior formulation by which we can estimate the probability for each hypothesis and select the most likely one. As we show in experiments, our approach is capable of dealing with extreme visual conditions including overexposure, high contrast, or low signal-to-noise ratio. This makes it a powerful technique for the particularly challenging task of estimating the pose of tumbling satellites for in-orbit robotic applications. Our method achieves state-of-the-art performance on the SPEED+ dataset and has won the SPEC2021 post-mortem competition.
https://arxiv.org/abs/2303.13241
Pedestrian occlusion is challenging for autonomous vehicles (AVs) at midblock locations on multilane roadways because an AV cannot detect crossing pedestrians that are fully occluded by downstream vehicles in adjacent lanes. This paper tests the capability of vehicle-to-vehicle (V2V) communication between an AV and its downstream vehicles to share midblock pedestrian crossings information. The researchers developed a V2V-based collision-avoidance decision strategy and compared it to a base scenario (i.e., decision strategy without the utilization of V2V). Simulation results showed that for the base scenario, the near-zero time-to-collision (TTC) indicated no time for the AV to take appropriate action and resulted in dramatic braking followed by collisions. But the V2V-based collision-avoidance decision strategy allowed for a proportional braking approach to increase the TTC allowing the pedestrian to cross safely. To conclude, the V2V-based collision-avoidance decision strategy has higher safety benefits for an AV interacting with fully occluded pedestrians at midblock locations on multilane roadways.
https://arxiv.org/abs/2303.13032
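The V2V abstract above contrasts a near-zero time-to-collision (TTC) with proportional braking once a pedestrian is reported earlier. The sketch below works through that arithmetic with standard constant-speed and constant-deceleration kinematics. The speeds, distances, and safety margin are hypothetical numbers, and the functions are illustrations, not the decision strategy from the paper.

```python
def time_to_collision(gap_m, closing_speed_mps):
    """TTC under constant speed; infinite if the gap is not closing."""
    if closing_speed_mps <= 0:
        return float("inf")
    return gap_m / closing_speed_mps

def required_deceleration(speed_mps, gap_m, margin_m=2.0):
    """Constant deceleration that stops the vehicle margin_m short of the gap."""
    usable = gap_m - margin_m
    if usable <= 0:
        return float("inf")
    return speed_mps ** 2 / (2.0 * usable)

speed = 15.0  # m/s, assumed AV speed

# Assumed base scenario: pedestrian becomes visible only 5 m ahead.
late_ttc = time_to_collision(5.0, speed)
late_decel = required_deceleration(speed, 5.0)

# Assumed V2V scenario: a downstream vehicle reports the pedestrian at 50 m.
early_ttc = time_to_collision(50.0, speed)
early_decel = required_deceleration(speed, 50.0)

print(f"late:  TTC={late_ttc:.2f} s, decel={late_decel:.2f} m/s^2")
print(f"early: TTC={early_ttc:.2f} s, decel={early_decel:.2f} m/s^2")
```

With these assumed numbers the late detection demands a deceleration far beyond what a road vehicle can deliver, while the early warning needs only moderate braking.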
Human-machine interaction (HMI) and human-robot interaction (HRI) can assist structural monitoring and structural dynamics testing in the laboratory and field. In vibratory experimentation, one mode of generating vibration is to use electrodynamic exciters. Manual control is a common way for the operator to set the input of the exciter. To measure the structural responses to these generated vibrations, sensors are attached to the structure. These sensors can be deployed by repeatable robots with high endurance, which require on-the-fly control. If the interface between operators and the controls were augmented, operators could visualize the experiments and exciter levels and define robot input with a better awareness of the area of interest. Robots can provide better aid to humans if intelligent on-the-fly control of the robot is (1) quantified and presented to the human, and (2) conducted in real time for human feedback informed by data. Operators would use the information provided by the new interface to change the control input based on their understanding of real-time parameters. This research proposes using Augmented Reality (AR) applications to provide humans with sensor feedback and control of actuators and robots. This method improves cognition by allowing the operator to maintain awareness of structures while adjusting conditions accordingly with the assistance of the new real-time interface. One interface application is developed to plot sensor data in addition to voltage, frequency, and duration controls for vibration generation. Two more applications are developed under a similar framework, one to control the position of a mediating robot and one to control the frequency of the robot movement. This paper presents the proposed model for the new control loop and then compares the new approach with a traditional method by measuring time delay in control input and user efficiency.
https://arxiv.org/abs/2303.13016
Dropped into an unknown environment, what should an agent do to quickly learn about the environment and how to accomplish diverse tasks within it? We address this question within the goal-conditioned reinforcement learning paradigm, by identifying how the agent should set its goals at training time to maximize exploration. We propose "Planning Exploratory Goals" (PEG), a method that sets goals for each training episode to directly optimize an intrinsic exploration reward. PEG first chooses goal commands such that the agent's goal-conditioned policy, at its current level of training, will end up in states with high exploration potential. It then launches an exploration policy starting at those promising states. To enable this direct optimization, PEG learns world models and adapts sampling-based planning algorithms to "plan goal commands". In challenging simulated robotics environments including a multi-legged ant robot in a maze, and a robot arm on a cluttered tabletop, PEG exploration enables more efficient and effective training of goal-conditioned policies relative to baselines and ablations. Our ant successfully navigates a long maze, and the robot arm successfully builds a stack of three blocks upon command. Website: this https URL
https://arxiv.org/abs/2303.13002
Compliant grippers, owing to their adaptivity and safety, have attracted considerable attention for unstructured grasping in real applications, such as industrial or logistic scenarios. However, accurate construction of the mathematical model depicting the bidirectional relationship between shape deformation and contact force for such grippers, such as the Fin-Ray grippers, remains stagnant to date. To address this research gap, this article devises, presents, and experimentally validates a universal bidirectional force-displacement mathematical model for compliant grippers based on the co-rotational concept, which endows such grippers with an intrinsic force sensing capability and offers a better insight into the design optimization. In Part 1 of the article, we introduce the fundamental theory of the co-rotational approach, where arbitrary large deformation of beam elements can be modeled. Its intrinsic principle enables the theoretical modeling to consider various types of configurations and key design parameters with very few assumptions made. Further, a force control algorithm is proposed, providing accurate displacement estimations of the gripper under external forces with minor computational loads. The performance of the proposed method is experimentally verified through comparison with Finite Element Analysis, where the influence of four key design parameters on the gripper's performance is investigated, facilitating systematic design optimization. Part 2 of this article, which demonstrates the force sensing capabilities and the effects of representative co-rotational modeling parameters on model accuracy, is released on Google Drive.
https://arxiv.org/abs/2303.12987
Control Barrier Functions offer safety certificates by dictating controllers that enforce safety constraints. However, their response depends on the class K function that is used to restrict the rate of change of the barrier function along the system trajectories. This paper introduces the notion of Rate Tunable Control Barrier Function (RT-CBF), which allows for online tuning of the response of CBF-based controllers. In contrast to the existing CBF approaches that use a fixed (predefined) class K function to ensure safety, we parameterize and adapt the class K function parameters online. Furthermore, we discuss the challenges associated with multiple barrier constraints, namely ensuring that they admit a common control input that satisfies them simultaneously for all time. In practice, RT-CBF enables designing parameter dynamics for (1) a better-performing response, where performance is defined in terms of the cost accumulated over a time horizon, or (2) a less conservative response. We propose a model-predictive framework that computes the sensitivity of the future states with respect to the parameters and uses Sequential Quadratic Programming for deriving an online law to update the parameters in the direction of improving the performance. When prediction is not possible, we also provide point-wise sufficient conditions to be imposed on any user-given parameter dynamics so that multiple CBF constraints continue to admit a common control input over time. Finally, we introduce RT-CBFs for decentralized uncooperative multi-agent systems, where a trust factor, computed based on the instantaneous ease of constraint satisfaction, is used to update parameters online for a less conservative response.
https://arxiv.org/abs/2303.12966
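The CBF abstract above notes that the controller's response depends on the class K function. A minimal sketch of that dependence, assuming a single integrator x' = u, a barrier h(x) = x_max - x, and a linear class K function alpha * h: the system, the gains, and the step size are hypothetical. This compares two fixed values of alpha and does not implement the online tuning that RT-CBF proposes.

```python
def cbf_filter(u_nominal, x, x_max, alpha):
    """Clip the nominal input so that h(x) = x_max - x stays non-negative.

    For x' = u, the CBF condition dh/dt >= -alpha * h reduces to
    u <= alpha * (x_max - x).
    """
    return min(u_nominal, alpha * (x_max - x))

def simulate(alpha, x0=0.0, x_max=1.0, u_nominal=1.0, dt=0.01, steps=100):
    """Forward-Euler rollout of the filtered single integrator."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        x += dt * cbf_filter(u_nominal, x, x_max, alpha)
        trajectory.append(x)
    return trajectory

conservative = simulate(alpha=0.5)  # small gain: brakes early
aggressive = simulate(alpha=5.0)    # large gain: brakes late

print(f"final x with alpha=0.5: {conservative[-1]:.3f}")
print(f"final x with alpha=5.0: {aggressive[-1]:.3f}")
```

Both runs stay inside the safe set, but the larger gain lets the state approach the boundary much sooner, which is the kind of trade-off that motivates adapting the parameter online.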
Data collected during Hurricane Ian (2022) quantifies the demands that small uncrewed aerial systems (UAS), or drones, place on the network communication infrastructure and identifies gaps in the field. Drones have been increasingly used for disaster response since Hurricane Katrina (2005); however, getting the data from the drone to the appropriate decision makers throughout incident command in a timely fashion has been problematic. These delays have persisted even as countries such as the USA have made significant investments in wireless infrastructure, rapidly deployable nodes, and an increase in commercial satellite solutions. Hurricane Ian serves as a case study of the mismatch between communications needs and capabilities. In the first four days of the response, nine drone teams flew 34 missions under the direction of the State of Florida FL-UAS1, generating 636 GB of data. The teams had access to six different wireless communications networks but had to resort to physically transferring data to the nearest intact emergency operations center in order to make the data available to the relevant agencies. The analysis of the mismatch contributes a model of the drone data-to-decision workflow in a disaster and quantifies wireless network communication requirements throughout the workflow in five factors. Four of the factors (availability, bandwidth, burstiness, and spatial distribution) were previously identified from analyses of Hurricanes Harvey (2017) and Michael (2018). This work adds upload rate as a fifth attribute. The analysis is expected to improve drone design and edge computing schemes as well as inform wireless communication research and development.
https://arxiv.org/abs/2303.12937
Crash data of autonomous vehicles (AV) or vehicles equipped with advanced driver assistance systems (ADAS) are the key information to understand the crash nature and to enhance the automation systems. However, most of the existing crash data sources are either limited by the sample size or suffer from missing or unverified data. To contribute to the AV safety research community, we introduce AVOID: an open AV crash dataset. Three types of vehicles are considered: Advanced Driving System (ADS) vehicles, Advanced Driver Assistance Systems (ADAS) vehicles, and low-speed autonomous shuttles. The crash data are collected from the National Highway Traffic Safety Administration (NHTSA), California Department of Motor Vehicles (CA DMV) and incident news worldwide, and the data are manually verified and summarized in ready-to-use format. In addition, land use, weather, and geometry information are also provided. The dataset is expected to accelerate the research on AV crash analysis and potential risk identification by providing the research community with data of rich samples, diverse data sources, clear data structure, and high data quality.
https://arxiv.org/abs/2303.12889
Cloud Robotics is helping to create a new generation of robots that leverage the nearly unlimited resources of large data centers (i.e., the cloud), overcoming the limitations imposed by on-board resources. Different processing power, capabilities, resource sizes, energy consumption, and so forth, make scheduling and task allocation critical components. The basic idea of task allocation and scheduling is to optimize performance by minimizing completion time, energy consumption, delays between two consecutive tasks, along with others, and maximizing resource utilization, number of completed tasks in a given time interval, and suchlike. In the past, several works have addressed various aspects of task allocation and scheduling. In this paper, we provide a comprehensive overview of task allocation and scheduling strategies and related metrics suitable for robotic network cloud systems. We discuss the issues related to allocation and scheduling methods and the limitations that need to be overcome. The literature review is organized according to three different viewpoints: Architectures and Applications, Methods and Parameters. In addition, the limitations of each method are highlighted for future research.
https://arxiv.org/abs/2303.12876
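The cloud robotics abstract above names completion time as one of the metrics that task allocation tries to minimize. As one small illustration, the sketch below uses the classic longest-processing-time-first greedy heuristic to spread tasks over identical resources. The task names, durations, and the assumption of identical resources are hypothetical, and the survey covers many strategies beyond this one.

```python
import heapq

def greedy_allocate(task_durations, n_resources):
    """Give each task, longest first, to the currently least-loaded resource."""
    heap = [(0.0, r) for r in range(n_resources)]  # (current load, resource id)
    heapq.heapify(heap)
    assignment = {r: [] for r in range(n_resources)}
    for task, duration in sorted(task_durations.items(), key=lambda kv: -kv[1]):
        load, r = heapq.heappop(heap)
        assignment[r].append(task)
        heapq.heappush(heap, (load + duration, r))
    makespan = max(load for load, _ in heap)  # finish time of the busiest resource
    return assignment, makespan

# Hypothetical workload in seconds, split between two compute resources.
tasks = {"map": 4.0, "plan": 3.0, "detect": 3.0, "grasp": 2.0}
assignment, makespan = greedy_allocate(tasks, n_resources=2)
print(assignment, makespan)
```

The heuristic is simple and fast but not optimal in general; it ignores energy, network delay, and heterogeneous hardware, which the abstract lists as further considerations.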
This article offers a literature review of goalkeeper robots in the context of the RoboCupSoccer competition. The latter is one of the various league categories hosted by the RoboCup Federation, which fosters AI and Robotics with their landmark challenges. Despite the number of articles on the subject of the goalkeeper, there is a lack of studies offering a comprehensive and up-to-date analysis. We propose to provide a review of research related to goalkeepers within the RoboCupSoccer leagues in order to extract possible improvements and scientific issues. The goalkeeper, although being a specific player, has many skills in common with other players. Therefore, this review is divided into three parts: perception, cognition and action, where the perception and action parts are common to all players and the cognition part focuses on goalkeepers. The discussion will open up on the possible improvements of the developments made for these goalkeepers.
https://arxiv.org/abs/2303.12635
In the most extensive robot evolution systems, both the bodies and the brains of the robots undergo evolution and the brains of 'infant' robots are also optimized by a learning process immediately after 'birth'. This paper is concerned with the brain evolution mechanism in such a system. In particular, we compare four options obtained by combining asexual or sexual brain reproduction with Darwinian or Lamarckian evolution mechanisms. We conduct experiments in simulation with a system of evolvable modular robots on two different tasks. The results show that sexual reproduction of the robots' brains is preferable in the Darwinian framework, but the effect is the opposite in the Lamarckian system (both using the same infant learning method). Our experiments suggest that the overall best option is asexual reproduction combined with the Lamarckian framework, as it obtains better robots in terms of fitness than the other three. Considering the evolved morphologies, the different brain reproduction methods do not lead to differences. This result indicates that the morphology of the robot is mainly determined by the task and the environment, not by the brain reproduction methods.
https://arxiv.org/abs/2303.12594
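The robot evolution abstract above compares four options: asexual or sexual brain reproduction, each under Darwinian or Lamarckian inheritance. The sketch below shows how those two switches could differ in code, assuming a brain is a flat list of weights. The `Robot` class, uniform crossover, and Gaussian mutation with standard deviation 0.1 are hypothetical choices, not the operators used in the paper.

```python
import random
from dataclasses import dataclass

@dataclass
class Robot:
    inherited_brain: list  # weights received at birth
    learned_brain: list    # weights after the infant learning phase

def heritable(robot, lamarckian):
    """Lamarckian inheritance passes on learned weights; Darwinian does not."""
    return robot.learned_brain if lamarckian else robot.inherited_brain

def reproduce(parents, sexual, lamarckian, rng):
    """Create a child brain from one parent (asexual) or two parents (sexual)."""
    a = heritable(parents[0], lamarckian)
    if sexual:
        b = heritable(parents[1], lamarckian)
        # Uniform crossover: each weight is copied from one of the two parents.
        child = [rng.choice(pair) for pair in zip(a, b)]
    else:
        # Asexual: copy the single parent and add small Gaussian mutations.
        child = [w + rng.gauss(0.0, 0.1) for w in a]
    # The child starts unlearned; infant learning would later update learned_brain.
    return Robot(inherited_brain=child, learned_brain=list(child))

rng = random.Random(0)
mother = Robot(inherited_brain=[0.0, 0.0], learned_brain=[1.0, 1.0])
father = Robot(inherited_brain=[0.0, 0.0], learned_brain=[2.0, 2.0])
lamarck_child = reproduce([mother, father], sexual=True, lamarckian=True, rng=rng)
darwin_child = reproduce([mother, father], sexual=True, lamarckian=False, rng=rng)
print(lamarck_child.inherited_brain, darwin_child.inherited_brain)
```

In this toy setup the Lamarckian child starts from weights its parents acquired through learning, while the Darwinian child starts from the parents' birth weights.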