We investigated the performance of existing semi- and fully autonomous methods for controlling flipper-based skid-steer robots. Our study reimplements these methods for fair comparison and introduces a novel semi-autonomous control policy that provides a compelling trade-off among current state-of-the-art approaches. We also propose new metrics for assessing cognitive load and traversal quality and offer a benchmarking interface for generating Quality-Load graphs from recorded data. Our results, presented in a 2D Quality-Load space, demonstrate that the new control policy effectively bridges the gap between autonomous and manual control methods. Additionally, we report the surprising finding that fully manual, continuous control of all six degrees of freedom remains highly effective when performed by an experienced operator using a well-designed analog controller from a third-person view.
我们研究了现有基于鳍片的履带转向机器人半自主和全自主控制方法的性能。我们的研究包括重新实现这些方法以进行公平比较,并提出了一种新的半自主控制策略,该策略在当前最先进的方法之间提供了一个引人注目的权衡。此外,我们还提出了用于评估认知负荷和穿越质量的新指标,并提供了从记录数据生成Quality-Load图的基准测试界面。我们的结果在二维的质量-负载空间中显示,新提出的控制策略有效地弥合了自主控制与手动控制方法之间的差距。另外,我们发现了一个令人惊讶的事实:当经验丰富的操作员使用来自第三方视点的精心设计的模拟控制器进行全手动、连续控制所有六个自由度时,这种方法仍然非常有效。
https://arxiv.org/abs/2503.14389
Obstacle avoidance for unmanned aerial vehicles such as quadrotors is a popular research topic. Most existing research focuses only on static environments, and obstacle avoidance in environments with multiple dynamic obstacles remains challenging. This paper proposes a novel deep reinforcement learning-based approach that lets quadrotors navigate through highly dynamic environments. We propose a lidar data encoder to extract obstacle information from the massive point cloud produced by the lidar. Multiple frames of historical scans are compressed into a two-dimensional obstacle map while preserving the required obstacle features. An end-to-end deep neural network is trained to extract the kinematics of dynamic and static obstacles from the obstacle map and to generate acceleration commands that steer the quadrotor around these obstacles. Our approach combines perception and navigation in a single neural network, which can change from a navigating state into a hovering state without mode switching. We also present simulations and real-world experiments to show the effectiveness of our approach when navigating in highly dynamic, cluttered environments.
无人机(如四旋翼飞行器)的避障是当前研究的一个热门话题。大多数现有的研究仅关注静态环境下的避障问题,而在含有多个动态障碍物的环境中进行避障仍然是一项挑战性任务。本文提出了一种基于深度强化学习的方法,旨在帮助四旋翼飞行器在高度动态的环境中导航。 我们设计了一个激光雷达数据编码器来从大量点云数据中提取障碍物信息。多帧历史扫描将被压缩成一个二维障碍地图,同时保留所需的障碍特征。通过端到端的深度神经网络训练,可以从该障碍图中提取静态和动态障碍物的动力学特性,并生成加速度指令以控制四旋翼飞行器避开这些障碍物。 我们的方法在一个单一的神经网络中结合了感知和导航功能,无需模式切换即可从导航状态转换为悬停状态。此外,我们还通过仿真和实际实验展示了该方法在高度动态且复杂的环境中导航时的有效性。
https://arxiv.org/abs/2503.14352
We address prehensile pushing, the problem of manipulating a grasped object by pushing against the environment. Our solution is an efficient nonlinear trajectory optimization problem relaxed from an exact mixed integer non-linear trajectory optimization formulation. The critical insight is recasting the external pushers (environment) as a discrete probability distribution instead of binary variables and minimizing the entropy of the distribution. The probabilistic reformulation allows all pushers to be used simultaneously, but at the optimum, the probability mass concentrates onto one due to the entropy minimization. We numerically compare our method against a state-of-the-art sampling-based baseline on a prehensile pushing task. The results demonstrate that our method finds trajectories 8 times faster and at a 20 times lower cost than the baseline. Finally, we demonstrate that a simulated and real Franka Panda robot can successfully manipulate different objects following the trajectories proposed by our method. Supplementary materials are available at this https URL.
我们研究了灵巧推压问题,即通过将被抓取的物体抵靠环境进行推压来操控该物体的问题。我们的解决方案是从一个精确的混合整数非线性轨迹优化公式中松弛得到的一个高效非线性轨迹优化问题。关键见解是将外部推力器(环境)重新表述为离散概率分布而不是二进制变量,并最小化该分布的熵。这种概率重述使得所有推力器都可以同时使用,但由于最小化了熵,在最优解处概率质量会集中到某一个推力器上。 我们通过数值实验在一项灵巧推压任务中将我们的方法与最先进的基于采样的基准方法进行了比较。结果表明,相比基准方法,我们的方法寻找轨迹的速度快8倍且成本低20倍。 最后,我们展示了模拟和真实的Franka Panda机器人可以根据我们提出的方法生成的轨迹成功操控不同的物体。补充材料可在此 https URL 获取。
https://arxiv.org/abs/2503.14268
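The entropy argument in the abstract above can be illustrated with a small sketch. It is not the authors' formulation: the softmax parametrization, the function names, and the three-pusher example are assumptions made for illustration only. It shows why an entropy penalty favors a distribution concentrated on a single pusher over one spread across all of them.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def softmax(z):
    """Map unconstrained scores z to a probability distribution (assumed parametrization)."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


# Hypothetical example with three candidate pushers.
uniform = softmax([0.0, 0.0, 0.0])   # mass spread over all pushers
peaked = softmax([8.0, 0.0, 0.0])    # mass concentrated on the first pusher

# A cost term proportional to entropy(p) is larger for `uniform` than for
# `peaked`, so minimizing it drives the distribution toward a single pusher.
print(entropy(uniform), entropy(peaked))
```

The uniform distribution attains the maximum entropy ln 3, while a one-hot distribution has entropy zero, which is the binary assignment the relaxation is meant to recover.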
Current transformer-based imitation learning approaches introduce discrete action representations and train an autoregressive transformer decoder on the resulting latent code. However, the initial quantization breaks the continuous structure of the action space thereby limiting the capabilities of the generative model. We propose a quantization-free method instead that leverages Generative Infinite-Vocabulary Transformers (GIVT) as a direct, continuous policy parametrization for autoregressive transformers. This simplifies the imitation learning pipeline while achieving state-of-the-art performance on a variety of popular simulated robotics tasks. We enhance our policy roll-outs by carefully studying sampling algorithms, further improving the results.
当前基于Transformer的模仿学习方法引入了离散的动作表示,并在由此产生的潜在编码上训练自回归Transformer解码器。然而,初始量化破坏了动作空间的连续结构,从而限制了生成模型的能力。我们提出了一种无量化的替代方法,该方法利用生成式无限词汇Transformer(GIVT)作为自回归Transformer的直接、连续策略参数化方式。这简化了模仿学习流程,并在多种流行的模拟机器人任务中实现了最先进的性能。我们还通过仔细研究采样算法来改进策略的执行(roll-out),进一步提升了结果。
https://arxiv.org/abs/2503.14259
This paper introduces a chain-driven, sandwich-legged, mid-size quadruped robot designed as an accessible research platform. The design prioritizes enhanced locomotion capabilities, improved reliability and safety of the actuation system, and simplified, cost-effective manufacturing processes. Locomotion performance is optimized through a sandwiched leg design and a dual-motor configuration, reducing leg inertia for agile movements. Reliability and safety are achieved by integrating robust cable strain reliefs, efficient heat sinks for motor thermal management, and mechanical limits to restrict leg motion. Simplified design considerations include a quasi-direct drive (QDD) actuator and the adoption of low-cost fabrication techniques, such as laser cutting and 3D printing, to minimize cost and ensure rapid prototyping. The robot weighs approximately 25 kg and is developed at a cost under $8000, making it a scalable and affordable solution for robotics research. Experimental validations demonstrate the platform's capability to execute trot and crawl gaits on flat terrain and slopes, highlighting its potential as a versatile and reliable quadruped research platform.
本文介绍了一种链驱动、三明治腿设计的中型四足机器人,旨在成为一个易于研究的平台。该设计优先考虑了增强行走能力、改善执行系统的可靠性和安全性以及简化和成本效益制造流程。通过采用三明治式腿部设计和双电机配置,优化了移动性能,减少了腿部惯性,使其能够进行敏捷运动。为了提高可靠性和安全性,设计集成了坚固的电缆拉伸缓解装置、高效的散热器用于电机热管理以及机械限位以限制腿部动作。简化的设计考虑包括准直接驱动(QDD)执行器和采用低成本制造技术,如激光切割和3D打印,以降低成本并确保快速原型制作。该机器人重约25公斤,并且开发成本低于8000美元,使其成为一个可扩展且经济实惠的机器人研究解决方案。实验验证表明,该平台能够在平坦地面和斜坡上执行慢跑步态和爬行步态,展示了其作为多功能可靠的四足研究平台的巨大潜力。
https://arxiv.org/abs/2503.14255
Autonomous large-scale machine operations require fast, efficient, and collision-free motion planning while addressing unique challenges such as hydraulic actuation limits and underactuated joint dynamics. This paper presents a novel two-step motion planning framework designed for an underactuated forestry crane. The first step employs GPU-accelerated stochastic optimization to rapidly compute a globally shortest collision-free path. The second step refines this path into a dynamically feasible trajectory using a trajectory optimizer that ensures compliance with system dynamics and actuation constraints. The proposed approach is benchmarked against conventional techniques, including RRT-based methods and purely optimization-based approaches. Simulation results demonstrate substantial improvements in computation speed and motion feasibility, making this method highly suitable for complex crane systems.
自主大规模机器操作需要快速、高效且无碰撞的运动规划,同时要解决诸如液压驱动限制和欠驱动关节动力学等独特挑战。本文提出了一种新颖的两步运动规划框架,专门针对欠驱动林业起重机设计。第一步骤利用GPU加速随机优化技术来迅速计算一条全局最短且无碰撞路径。第二步骤则通过轨迹优化器将该路径细化为符合系统动态特性和执行机构约束的可行轨迹。 所提出的这种方法与传统的基于RRT(Rapidly-exploring Random Tree)的方法和纯优化方法进行了基准比较,模拟结果显示在计算速度和运动可行性方面有了显著改进,使得此方法非常适合复杂的起重机系统。
https://arxiv.org/abs/2503.14160
In this paper, the safety-critical control problem for uncertain systems under multiple control barrier function (CBF) constraints and input constraints is investigated. A novel framework is proposed to generate a safety filter that minimizes changes to reference inputs when safety risks arise, ensuring a balance between safety and performance. A nonlinear disturbance observer (DOB) based on the robust integral of the sign of the error (RISE) is used to estimate system uncertainties, ensuring that the estimation error converges to zero exponentially. This error bound is integrated into the safety-critical controller to reduce conservativeness while ensuring safety. To further address the challenges arising from multiple CBF and input constraints, a novel Volume CBF (VCBF) is proposed by analyzing the feasible space of the quadratic programming (QP) problem. To ensure that the feasible space does not vanish under disturbances, a DOB-VCBF-based method is introduced, ensuring system safety while maintaining the feasibility of the resulting QP. Subsequently, several groups of simulation and experimental results are provided to validate the effectiveness of the proposed controller.
在这篇论文中,研究了在存在多个控制屏障函数(CBF)约束和输入约束的情况下,不确定系统中的安全关键性控制问题。提出了一种新框架,用于生成当出现安全风险时最小化对参考输入变化的安全过滤器,确保安全性与性能之间的平衡。使用基于鲁棒误差符号积分(RISE)的非线性扰动观测器(DOB),来估计系统的不确定性,并确保估算误差以指数方式收敛于零。将此误差界限整合到安全关键控制器中,从而在保证安全性的前提下减少保守程度。为了进一步解决由多个CBF和输入约束带来的挑战,通过分析二次规划(QP)问题的可行空间,提出了一个新的体积控制屏障函数(VCBF)。为防止干扰导致的可行空间消失,引入了一种基于DOB-VCBF的方法,在保证系统安全的同时维护所得到QP的可行性。随后,提供了几组仿真和实验结果以验证所提出控制器的有效性。
https://arxiv.org/abs/2503.13996
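The idea of a safety filter that changes the reference input as little as possible can be sketched for the simplest case: one affine CBF constraint a·u + b >= 0 and no input bounds, where the QP has a closed-form solution. This is a generic textbook construction, not the paper's DOB-VCBF method. The `margin` argument is a hypothetical stand-in for a disturbance-dependent tightening term.

```python
def safety_filter(u_ref, a, b, margin=0.0):
    """Return the input closest to u_ref (Euclidean norm) with a.u + b - margin >= 0.

    Closed-form solution of a QP with a single affine constraint. The names
    and the `margin` robustness term are illustrative assumptions.
    """
    slack = sum(ai * ui for ai, ui in zip(a, u_ref)) + b - margin
    if slack >= 0.0:
        # Reference input is already safe: leave it untouched.
        return list(u_ref)
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint cannot be satisfied by any input")
    # Project u_ref onto the boundary of the safe half-space.
    scale = slack / norm_sq
    return [ui - scale * ai for ui, ai in zip(u_ref, a)]


# Hypothetical 2D example: the constraint -u[0] + 0.5 >= 0 caps u[0] at 0.5.
print(safety_filter([1.0, 0.0], [-1.0, 0.0], 0.5))
```

With several CBF constraints and input bounds the feasible set is an intersection of half-spaces and may be empty, which is the feasibility issue the abstract's volume-based analysis addresses. A general QP solver would replace the closed form in that case.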
The capability of effectively moving on complex terrains such as sand and gravel can empower our robots to robustly operate in outdoor environments, and assist with critical tasks such as environment monitoring, search-and-rescue, and supply delivery. Inspired by the Mount Lyell salamander's ability to curl its body into a loop and effectively roll down hill slopes, in this study we develop a sand-rolling robot and investigate how its locomotion performance is governed by the shape of its body. We experimentally tested three different body shapes: Hexagon, Quadrilateral, and Triangle. We found that Hexagon and Triangle can achieve a faster rolling speed on sand, but exhibited more frequent failures of getting stuck. Analysis of the interaction between robot and sand revealed the failure mechanism: the deformation of the sand produced a local "sand incline" underneath robot contact segments, increasing the effective region of supporting polygon (ERSP) and preventing the robot from shifting its center of mass (CoM) outside the ERSP to produce sustainable rolling. Based on this mechanism, a highly-simplified model successfully captured the critical body pitch for each rolling shape to produce sustained rolling on sand, and informed design adaptations that mitigated the locomotion failures and improved robot speed by more than 200%. Our results provide insights into how locomotors can utilize different morphological features to achieve robust rolling motion across deformable substrates.
在复杂地形如沙地和碎石上高效移动的能力可以增强机器人的野外操作能力,并帮助完成环境监测、搜寻救援以及物资运输等关键任务。受到利尔山火蜥蜴通过蜷曲身体成环状并有效滚动下坡的启发,我们开发了一种能够在沙地上滚动的机器人,并研究了其运动性能如何受其体型形状的影响。我们在实验中测试了三种不同的体形:六边形、四边形和三角形。 我们的发现显示,六边形和三角形可以在沙地实现更快的滚动速度,但同时也表现出更多的卡住现象。对机器人与沙子之间相互作用的分析揭示了这种失败机制:沙地变形在机器人接触部分下面形成了局部“沙坡”,增加了有效支撑多边形(ERSP)的面积,并阻止了机器人的重心(CoM)移出ERSP,从而阻碍可持续滚动。 基于这一机制,我们建立了一个高度简化的模型,成功捕捉到了每个滚动形状产生持续滚动的关键身体倾斜角度,并提出设计改进措施以缓解运动故障并提升机器人速度超过200%。我们的研究结果提供了有关如何利用不同的形态特征来实现跨可变形基底的稳定滚动运动的见解。
https://arxiv.org/abs/2503.13919
This paper presents the design of a biomimetic robotic squid (dubbed URSULA) developed for dexterous underwater manipulation. The robot serves as a test bed for several novel underwater technologies such as soft manipulators, propeller-less propulsion, model-mediated tele-operation with video and haptic feedback, sonar-based underwater mapping, localization, and navigation, and high-bandwidth visible light communications. Following the finalization of the detailed design, a prototype was manufactured and is currently undergoing pool tests.
本文介绍了为水下灵巧操作设计的仿生机器人鱿鱼(代号URSULA)的设计。该机器人作为多种新颖水下技术的测试平台,例如软体机械手、无推进器推进系统、利用视频和触觉反馈的模型介导远程操作、基于声纳的水下地图绘制、定位与导航以及高带宽可见光通信等技术。完成详细设计后,制造了原型机,并正在进行游泳池测试。
https://arxiv.org/abs/2503.13913
The engineering community currently encounters significant challenges in the systematic development and validation of autonomy algorithms for off-road ground vehicles. These challenges are posed by unusually high test parameters and algorithmic variants. In order to address these pain points, this work presents an optimized digital engineering framework that tightly couples digital twin simulations with model-based systems engineering (MBSE) and model-based design (MBD) workflows. The efficacy of the proposed framework is demonstrated through an end-to-end case study of an autonomous light tactical vehicle (LTV) performing visual servoing to drive along a dirt road and reacting to any obstacles or environmental changes. The presented methodology allows for traceable requirements engineering, efficient variant management, granular parameter sweep setup, systematic test-case definition, and automated execution of the simulations. The candidate off-road autonomy algorithm is evaluated for satisfying requirements against a battery of 128 test cases, which is procedurally generated based on the test parameters (times of the day and weather conditions) and algorithmic variants (perception, planning, and control sub-systems). Finally, the test results and key performance indicators are logged, and the test report is generated automatically. This then allows for manual as well as automated data analysis with traceability and tractability across the digital thread.
当前的工程界在开发和验证越野地面车辆自主算法方面遇到了重大挑战,这些挑战主要由异常高的测试参数和算法变体引起。为了应对这些问题,本工作提出了一种优化的数字工程技术框架,该框架紧密耦合了数字孪生仿真与基于模型的系统工程(MBSE)和基于模型的设计(MBD)的工作流程。通过一个完整的案例研究来展示所提出的框架的有效性:一辆自主轻型战术车辆(LTV)执行视觉伺服操作沿土路行驶,并对任何障碍物或环境变化作出反应。该方法支持可追溯的需求工程、高效的变体管理、详细的参数扫描设置、系统化的测试用例定义以及模拟的自动化执行。 候选越野自主算法通过一系列128个测试案例进行了评估,这些测试案例根据测试参数(一天中的时间及天气条件)和算法变体(感知、规划与控制子系统)程序化生成。最终,测试结果及关键性能指标被记录下来,并自动生成测试报告,从而为手动以及自动数据分析提供可追溯性和透明度贯穿整个数字线程。
https://arxiv.org/abs/2503.13787
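The procedural generation of a test battery from test parameters and algorithmic variants, as described in the abstract above, can be sketched as a Cartesian product. The abstract gives only the total of 128 cases. The individual option lists below (four times of day, four weather conditions, two variants per sub-system) are hypothetical values chosen so the product comes to 128, not the authors' actual configuration.

```python
from itertools import product

# All option names and counts are illustrative assumptions.
TIMES_OF_DAY = ["dawn", "noon", "dusk", "night"]
WEATHER = ["clear", "rain", "fog", "snow"]
PERCEPTION = ["perception_a", "perception_b"]
PLANNING = ["planning_a", "planning_b"]
CONTROL = ["control_a", "control_b"]


def generate_test_cases():
    """Enumerate every combination of test parameters and algorithm variants."""
    keys = ("time_of_day", "weather", "perception", "planning", "control")
    axes = (TIMES_OF_DAY, WEATHER, PERCEPTION, PLANNING, CONTROL)
    return [dict(zip(keys, combo)) for combo in product(*axes)]


cases = generate_test_cases()
print(len(cases), cases[0])
```

Enumerating the full product keeps every case traceable to a specific parameter and variant combination, at the price of a test count that grows multiplicatively with each added axis.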
Galloping is a common high-speed gait in both animals and quadrupedal robots, yet its energetic characteristics remain insufficiently explored. This study systematically analyzes a large number of possible galloping gaits by categorizing them based on the number of flight phases per stride and the phase relationships between the front and rear legs, following Hildebrand's framework for asymmetrical gaits. Using the A1 quadrupedal robot from Unitree, we model galloping dynamics as a hybrid dynamical system and employ trajectory optimization (TO) to minimize the cost of transport (CoT) across a range of speeds. Our results reveal that rotary and transverse gallop footfall sequences exhibit no fundamental energetic difference, despite variations in body yaw and roll motion. However, the number of flight phases significantly impacts energy efficiency: galloping with no flight phases is optimal at lower speeds, whereas galloping with two flight phases minimizes energy consumption at higher speeds. We validate these findings using a quadratic programming (QP)-based controller, developed in our previous work, in Gazebo simulations. These insights advance the understanding of quadrupedal locomotion energetics and may inform future legged robot designs for adaptive, energy-efficient gait transitions.
跳跃奔跑是一种在动物和四足机器人中常见的高速步态,但其能量特性尚未充分研究。本研究系统地分析了大量可能的跳跃跑步步态,并根据每一步中的飞行阶段数量以及前后腿之间的相位关系进行分类,遵循Hildebrand提出的不对称步态框架。使用Unitree公司的A1四足机器人,我们将跳跃奔跑的动力学建模为混合动力系统,并采用轨迹优化(TO)方法,在一系列速度范围内最小化运输成本(CoT)。我们的研究结果表明,旋转式和横向跳跃步伐序列在能量方面没有根本性差异,尽管它们的身体侧倾和滚动运动有所不同。然而,飞行阶段的数量对能量效率有显著影响:无飞行阶段的跳跃奔跑在较低速度下是最优的选择,而具有两个飞行阶段的跳跃奔跑则能在较高速度下降低能耗。我们使用Gazebo仿真验证了这些发现,并通过我们在先前工作中开发的一种基于二次规划(QP)的方法进行了控制器设计。这些见解加深了对四足步态能量学的理解,并可能为未来腿部机器人适应性、节能的步态转换设计提供信息。
https://arxiv.org/abs/2503.13716
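The cost of transport minimized in the abstract above is conventionally defined as the energy used per unit weight per unit distance, CoT = E / (m g d). The sketch below uses that standard dimensionless definition. The exact energy model the authors optimize is not stated in the abstract, and the numbers in the example are made up.

```python
def cost_of_transport(energy_j, mass_kg, distance_m, g=9.81):
    """Dimensionless cost of transport: energy / (weight * distance)."""
    if mass_kg <= 0.0 or distance_m <= 0.0:
        raise ValueError("mass and distance must be positive")
    return energy_j / (mass_kg * g * distance_m)


# Hypothetical stride: 60 J spent to move a 12 kg robot 1.5 m.
print(cost_of_transport(energy_j=60.0, mass_kg=12.0, distance_m=1.5))
```

Because the quantity is normalized by weight and distance, it allows gaits at different speeds and stride lengths to be compared on one scale.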
Many applications in robotics require primitive spherical geometry, especially in cases where efficient distance queries are necessary. Manual creation of spherical models is time-consuming and prone to errors. This paper presents Foam, a tool to generate spherical approximations of robot geometry from an input Universal Robot Description Format (URDF) file. Foam provides a robust preprocessing pipeline to handle mesh defects and a number of configuration parameters to control the level and approximation of the spherization, and generates an output URDF with collision geometry specified only by spheres. We demonstrate Foam on a number of standard robot models on common tasks, and demonstrate improved collision checking and distance query performance with only a minor loss in fidelity compared to the true collision geometry. We release our tool as an open source Python library and containerized command-line application to facilitate adoption across the robotics community.
许多机器人应用需要基础的球面几何学,特别是在需要高效距离查询的情况下。手动创建球形模型既耗时又容易出错。本文介绍了一种名为Foam的工具,它可以从输入的通用机器人描述格式(URDF)文件中生成机器人的球形近似模型。Foam提供了一个强大的预处理流水线来处理网格缺陷,并提供了多种配置参数以控制球化过程的程度和精度,同时生成一个仅用球体指定碰撞几何形状的输出URDF文件。我们在一系列标准机器人模型上展示了Foam的应用效果,并在常见的任务中证明了其能显著提高碰撞检测和距离查询性能,而与真实碰撞几何相比仅有微小的准确性损失。 我们以开源Python库和容器化命令行应用的形式发布该工具,以便在整个机器人技术社区内广泛采用。
https://arxiv.org/abs/2503.13704
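The reason sphere-only collision geometry makes distance queries cheap can be shown with a short sketch: the signed distance between two spheres is the center distance minus both radii, so a query between two sphere sets reduces to a minimum over pairs. This is a generic illustration and does not use Foam's API. The data layout (center tuple, radius) is an assumption.

```python
import math


def sphere_distance(a, b):
    """Signed distance between two spheres given as (center, radius); negative means overlap."""
    (center_a, radius_a), (center_b, radius_b) = a, b
    return math.dist(center_a, center_b) - radius_a - radius_b


def min_distance(spheres_a, spheres_b):
    """Brute-force minimum signed distance between two sets of spheres."""
    return min(sphere_distance(a, b) for a in spheres_a for b in spheres_b)


# Hypothetical link approximated by two spheres, and a single spherical obstacle.
link = [((0.0, 0.0, 0.0), 1.0), ((2.0, 0.0, 0.0), 0.5)]
obstacle = [((5.0, 0.0, 0.0), 1.0)]
print(min_distance(link, obstacle))
```

The pairwise loop is quadratic in the number of spheres, which is one reason the level of approximation mentioned in the abstract matters: fewer spheres give faster queries at lower fidelity.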
Modular robotics enables the development of versatile and adaptive robotic systems with autonomous reconfiguration. This paper presents a modular robotic system in which each module has independent actuation, battery power, and control, allowing both individual mobility and coordinated locomotion. A hierarchical Central Pattern Generator (CPG) framework governs motion, with a low-level CPG controlling individual modules and a high-level CPG synchronizing inter-module coordination, enabling smooth transitions between independent and collective behaviors. To validate the system, we conduct simulations in MuJoCo and hardware experiments, evaluating locomotion across different configurations. We first analyze single-module motion, followed by two-module cooperative locomotion. Results demonstrate the effectiveness of the CPG-based control framework in achieving robust, flexible, and scalable locomotion. The proposed modular architecture has potential applications in search and rescue, environmental monitoring, and autonomous exploration, where adaptability and reconfigurability are essential.
模块化机器人技术使得开发多功能和适应性强的机器人系统成为可能,这些系统能够自主重组。本文介绍了一种模块化机器人系统,在该系统中每个模块都具有独立的驱动装置、电池电源以及控制系统,从而实现了单个移动与协调运动。一个分层的中央模式生成器(CPG)框架管理着机器人的动作:低级CPG控制单独的模块,而高级CPG则负责模块间的协调同步,从而使机器人能够在个体行为和集体行为之间平滑过渡。 为了验证该系统,我们在MuJoCo中进行了模拟实验,并通过硬件实验对其进行了评估。在不同的配置下,我们评估了机器人的运动能力。首先分析单一模块的独立动作,然后是两个模块之间的合作移动。结果表明基于CPG的控制框架能够有效地实现稳健、灵活且可扩展的运动。 所提出的模块化架构具有潜在的应用前景,尤其适用于搜索和救援、环境监测以及自主探索等领域,因为这些领域中适应性和重组能力至关重要。
https://arxiv.org/abs/2503.13674
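Inter-module synchronization of the kind a high-level CPG provides can be sketched with two coupled phase oscillators. This is a generic Kuramoto-style model, not the hierarchical CPG from the paper. The coupling law, gain, frequency, and target phase offset are all assumptions for illustration.

```python
import math


def simulate_cpg(theta1, theta2, target_offset,
                 omega=2.0 * math.pi, gain=2.0, dt=0.01, steps=2000):
    """Integrate two coupled phase oscillators with explicit Euler steps.

    Each oscillator advances at frequency `omega` and is pulled toward the
    phase relation theta2 - theta1 = target_offset.
    """
    for _ in range(steps):
        d1 = omega + gain * math.sin(theta2 - theta1 - target_offset)
        d2 = omega + gain * math.sin(theta1 - theta2 + target_offset)
        theta1 += d1 * dt
        theta2 += d2 * dt
    return theta1, theta2


def wrapped_error(theta1, theta2, target_offset):
    """Phase-offset error wrapped to the interval [-pi, pi)."""
    err = theta2 - theta1 - target_offset
    return (err + math.pi) % (2.0 * math.pi) - math.pi


# Two modules start nearly in phase and are driven toward anti-phase motion.
t1, t2 = simulate_cpg(0.0, 0.3, target_offset=math.pi)
print(wrapped_error(t1, t2, math.pi))
```

Under this coupling the offset error e obeys de/dt = -2 * gain * sin(e), so it decays to zero from almost any starting phases. Changing `target_offset` at run time is one simple way to switch between coordination patterns.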
Field-based reactive control provides a minimalist, decentralized route to guiding robots that lack onboard computation. Such schemes are well suited to resource-limited machines like microrobots, yet implementation artifacts, limited behaviors, and the frequent lack of formal guarantees blunt adoption. Here, we address these challenges with a new geometric approach called artificial spacetimes. We show that reactive robots navigating control fields obey the same dynamics as light rays in general relativity. This surprising connection allows us to adopt techniques from relativity and optics for constructing and analyzing control fields. When implemented, artificial spacetimes guide robots around structured environments, simultaneously avoiding boundaries and executing tasks like rallying or sorting, even when the field itself is static. We augment these capabilities with formal tools for analyzing what robots will do and provide experimental validation with silicon-based microrobots. Combined, this work provides a new framework for generating composed robot behaviors with minimal overhead.
基于场的反应控制为缺乏机载计算能力的机器人提供了一种简约且分散化的引导途径。此类方案非常适合资源有限的机器,如微小机器人,然而实现过程中的技术限制、行为受限以及缺少正式保证等因素阻碍了其广泛应用。本文提出一种名为人工时空的新几何方法来解决这些挑战。我们发现,在控制场中导航的反应式机器人遵循广义相对论中光线的动力学规律。这一意外联系使我们可以采用来自相对论和光学的技术,用于构造和分析控制场。实际应用时,人工时空能够指导机器人在结构化的环境中移动,同时避开边界并完成任务(如集结或分类),即使控制场本身是静态的也不例外。我们还加入了正式工具来分析机器人的行为,并通过硅基微小机器人进行了实验验证。综合来看,这项工作提供了一个新的框架,用于生成具有最小开销的组合式机器人行为。
https://arxiv.org/abs/2503.13355
Radar has become an essential sensor for autonomous navigation, especially in challenging environments where camera and LiDAR sensors fail. 4D single-chip millimeter-wave radar systems, in particular, have drawn increasing attention thanks to their ability to provide spatial and Doppler information with low hardware cost and power consumption. However, most single-chip radar systems using traditional signal processing, such as Fast Fourier Transform, suffer from limited spatial resolution in radar detection, significantly limiting the performance of radar-based odometry and Simultaneous Localization and Mapping (SLAM) systems. In this paper, we develop a novel radar signal processing pipeline that integrates spatial domain beamforming techniques, and extend it to 3D Direction of Arrival estimation. Experiments using public datasets are conducted to evaluate and compare the performance of our proposed signal processing pipeline against traditional methodologies. These tests specifically focus on assessing structural precision across diverse scenes and measuring odometry accuracy in different radar odometry systems. This research demonstrates the feasibility of achieving more accurate radar odometry by simply replacing the standard FFT-based processing with the proposed pipeline. The code is available on GitHub.
雷达已成为自主导航中不可或缺的传感器,特别是在相机和激光雷达传感器失效的复杂环境中。4D单芯片毫米波雷达系统因其能以低成本和低功耗提供空间及多普勒信息而备受关注。然而,大多数采用传统信号处理方法(如快速傅里叶变换)的单芯片雷达系统在雷达检测中存在空间分辨率受限的问题,这极大地限制了基于雷达的里程计(Odometry)和即时定位与地图构建(SLAM)系统的性能。 本文开发了一种新的雷达信号处理流程,该流程集成了空域波束形成技术,并将其扩展到三维到达方向的估计。我们使用公开数据集进行实验,以评估并比较我们的新信号处理方法与传统方法在不同场景下的结构精度以及在各种雷达里程计系统中的里程计精度。 本研究证明了只需用所提出的处理流程替换标准的基于快速傅里叶变换(FFT)的处理,即可实现更精确的雷达里程计。代码可在GitHub上获取。
https://arxiv.org/abs/2503.13252
Gaze-based control is a promising and effective form of human-robot interaction in assistive robotic systems. However, current gaze-based assistive systems mainly help users with basic grasping actions, offering limited support. Moreover, the restricted intent recognition capability constrains the assistive system's ability to provide diverse assistance functions. In this paper, we propose an open implicit intention recognition framework powered by a Large Language Model (LLM) and a Vision Foundation Model (VFM), which can process gaze input and recognize user intents that are not confined to predefined or specific scenarios. Furthermore, we implement a gaze-driven LLM-enhanced assistive robot system (MindEye-OmniAssist) that recognizes users' intentions through gaze and assists in completing tasks. To achieve this, the system uses an open-vocabulary object detector, an intention recognition network, and an LLM to infer the user's full intentions. By integrating eye movement feedback and the LLM, it generates action sequences to assist the user in completing tasks. Real-world experiments have been conducted for assistive tasks, and the system achieved an overall success rate of 41/55 across various undefined tasks. Preliminary results show that the proposed method holds the potential to provide a more user-friendly human-computer interaction interface and significantly enhance the versatility and effectiveness of assistive systems by supporting more complex and diverse tasks.
在辅助机器人系统中,基于凝视的控制是一项有前景且有效的互动方式。然而,目前的基于凝视的辅助系统主要帮助用户执行基本抓握动作,提供的支持有限。此外,受限于意图识别能力,这些辅助系统难以提供多样化的协助功能。 为此,本文提出了一种由大型语言模型(LLM)和视觉基础模型(VFM)驱动的开放式隐式意图识别框架,该框架可以处理凝视输入并识别超出预定义或特定场景限制的用户意图。此外,我们还实现了一个基于凝视、增强型LLM辅助机器人系统(MindEye-OmniAssist),通过用户的凝视来识别其意图,并帮助完成任务。为了实现这一目标,系统利用开放词汇对象检测器、意图识别网络和LLM推断出完整的用户意图。结合眼球运动反馈和LLM,该系统生成动作序列以辅助用户完成任务。 在实际应用场景中进行了测试,涵盖各种未定义的任务,结果表明,该系统的整体成功率为41/55。初步结果显示,所提出的方法有望提供更为人性化的交互界面,并通过支持更复杂多样的任务显著提升辅助系统的灵活性和有效性。
https://arxiv.org/abs/2503.13250
The inertia tensor is an important parameter in many engineering fields, but measuring it can be cumbersome and involve multiple experiments or accurate and expensive equipment. We propose a method to measure the moment of inertia tensor of a rigid body from a single spinning throw, by attaching a small and inexpensive stand-alone measurement device consisting of a gyroscope, accelerometer and a reaction wheel. The method includes a compensation for the increase of moment of inertia due to adding the measurement device to the body, and additionally obtains the location of the centre of gravity of the body as an intermediate result. Experiments performed with known rigid bodies show that the mean accuracy is around 2%.
惯性张量在许多工程领域中是一个重要的参数,但测量它往往繁琐且需要进行多次实验或使用精确而昂贵的设备。我们提出了一种方法,通过将一个由陀螺仪、加速度计和反应轮组成的便携式独立测量装置附加到刚体上,并从单次旋转投掷中即可测得该刚体的惯性张量。此方法包括对由于添加测量装置而导致的转动惯量增加进行补偿,并且作为中间结果,还可以获得刚体质心的位置。通过对已知刚体进行实验验证,我们发现平均精度约为2%。
https://arxiv.org/abs/2503.13137
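The compensation for the attached measurement device mentioned in the abstract above can be illustrated with a strongly simplified sketch. It assumes the device acts as a point mass and that the measured tensor and the device position are expressed about the same reference point and in the same frame. The paper's actual compensation is not described in the abstract and may differ, for example by accounting for the device's own inertia and the shift of the combined centre of gravity.

```python
def point_mass_inertia(mass, r):
    """Inertia tensor of a point mass at position r: m * (|r|^2 * I - r r^T)."""
    r_sq = sum(c * c for c in r)
    return [[mass * ((r_sq if i == j else 0.0) - r[i] * r[j]) for j in range(3)]
            for i in range(3)]


def remove_device(measured, device_mass, device_pos):
    """Subtract the (assumed point-mass) device contribution from a measured tensor."""
    device = point_mass_inertia(device_mass, device_pos)
    return [[measured[i][j] - device[i][j] for j in range(3)] for i in range(3)]


# Hypothetical numbers: a 2 kg device mounted 1 m along the x axis.
measured = [[1.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
print(remove_device(measured, 2.0, (1.0, 0.0, 0.0)))
```

A mass on the x axis adds nothing about x and m·d² about y and z, which is why only the last two diagonal entries change in this example.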
Fully decentralized, safe, and deadlock-free multi-robot navigation in dynamic, cluttered environments is a critical challenge in robotics. Current methods require exact state measurements in order to enforce safety and liveness e.g. via control barrier functions (CBFs), which is challenging to achieve directly from onboard sensors like lidars and cameras. This work introduces LIVEPOINT, a decentralized control framework that synthesizes universal CBFs over point clouds to enable safe, deadlock-free real-time multi-robot navigation in dynamic, cluttered environments. Further, LIVEPOINT ensures minimally invasive deadlock avoidance behavior by dynamically adjusting agents' speeds based on a novel symmetric interaction metric. We validate our approach in simulation experiments across highly constrained multi-robot scenarios like doorways and intersections. Results demonstrate that LIVEPOINT achieves zero collisions or deadlocks and a 100% success rate in challenging settings compared to optimization-based baselines such as MPC and ORCA and neural methods such as MPNet, which fail in such environments. Despite prioritizing safety and liveness, LIVEPOINT is 35% smoother than baselines in the doorway environment, and maintains agility in constrained environments while still being safe and deadlock-free.
在动态且复杂的环境中,实现完全去中心化的、安全的以及无死锁的多机器人导航是一项关键挑战。当前的方法需要通过精确的状态测量(例如利用控制屏障函数(CBFs))来确保系统的安全性与活性,而直接从机载传感器如激光雷达和摄像头获取这种精确状态信息是非常具有挑战性的。这项工作引入了LIVEPOINT框架,这是一种去中心化的控制系统,它可以通过点云合成通用的CBFs,从而实现在动态且复杂的环境中进行安全且无死锁的实时多机器人导航。此外,LIVEPOINT通过基于一种新颖的对称交互度量来动态调整代理的速度,在避免死锁时保持最小程度的侵入性。 我们通过对包含高约束条件的多个机器人场景(例如门和交叉口)进行了模拟实验来验证我们的方法。结果显示,与基于优化的方法如MPC和ORCA及神经网络方法如MPNet相比,LIVEPOINT在具有挑战性的环境中实现了零碰撞或死锁以及100%的成功率,而后者在这种环境下会失败。尽管优先考虑安全性与活性,LIVEPOINT在门环境中的平滑度仍比基准高出35%,并且在受约束的环境中保持了灵活性的同时确保了安全性和无死锁状态。
https://arxiv.org/abs/2503.13098
Flexible electronic skins that simultaneously sense touch and bend are desired in several application areas, such as to cover articulated robot structures. This paper introduces a flexible tactile sensor based on Electrical Impedance Tomography (EIT), capable of simultaneously detecting and measuring contact forces and flexion of the sensor. The sensor integrates a magnetic hydrogel composite and utilizes EIT to reconstruct internal conductivity distributions. Real-time estimation is achieved through the one-step Gauss-Newton method, which dynamically updates reference voltages to accommodate sensor deformation. A convolutional neural network is employed to classify interactions, distinguishing between touch, bending, and idle states using pre-reconstructed images. Experimental results demonstrate an average touch localization error of 5.4 mm (SD 2.2 mm) and an average bending angle estimation error of 1.9° (SD 1.6°). The proposed adaptive reference method effectively distinguishes between single- and multi-touch scenarios while compensating for deformation effects. This makes the sensor a promising solution for multimodal sensing in robotics and human-robot collaboration.
同时感知触觉和弯曲的柔性电子皮肤在多个应用领域中备受期待,例如覆盖可动机器人结构。本文介绍了一种基于电阻层析成像(EIT)的柔性触觉传感器,该传感器能够同时检测和测量接触力以及传感器的弯曲角度。此传感器整合了磁性水凝胶复合材料,并利用EIT重建内部导电分布情况。通过一步高斯-牛顿法实现实时估计,这种方法动态更新参考电压以适应传感器变形。卷积神经网络被用来分类各种交互状态,能够区分触摸、弯曲和空闲状态,并使用预先重建的图像进行识别。 实验结果显示,在触觉定位误差方面平均为5.4毫米(标准差2.2毫米),在弯曲角度估计误差上平均为1.9度(标准差1.6度)。提出的自适应参考方法能够有效区分单点触摸和多点触摸场景,并补偿变形效应。这使得该传感器成为机器人技术和人机协作中实现多模态感知的一个有前途的解决方案。
https://arxiv.org/abs/2503.13048
This paper presents a dual-channel tactile skin that integrates Electrical Impedance Tomography (EIT) with air pressure sensing to achieve accurate multi-contact force detection. The EIT layer provides spatial contact information, while the air pressure sensor delivers precise total force measurement. Our framework combines these complementary modalities through: deep learning-based EIT image reconstruction, contact area segmentation, and force allocation based on relative conductivity intensities from EIT. The experiments demonstrated 15.1% average force estimation error in single-contact scenarios and 20.1% in multi-contact scenarios without extensive calibration data requirements. This approach effectively addresses the challenge of simultaneously localizing and quantifying multiple contact forces without requiring complex external calibration setups, paving the way for practical and scalable soft robotic skin applications.
这篇论文提出了一种双通道触觉皮肤,该皮肤将电阻层析成像(EIT)与气压传感相结合,以实现准确的多点接触力检测。EIT 层提供空间接触信息,而气压传感器则提供精确的总力测量值。我们的框架通过以下方式结合这些互补模式:基于深度学习的 EIT 图像重建、接触区域分割和根据 EIT 的相对电导强度分配力。实验结果显示,在单点接触场景中平均力估计误差为 15.1%,在多点接触场景中的误差为 20.1%,并且无需大量校准数据即可实现这些效果。这种方法有效地解决了同时定位和量化多个接触力的难题,而无需复杂的外部校准装置,为实际且可扩展的软体机器人皮肤应用铺平了道路。
https://arxiv.org/abs/2503.13036
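The force allocation step in the abstract above, where a total force from the pressure channel is split among contacts according to relative EIT conductivity intensities, can be sketched as a proportional split. The function name and the use of a plain linear weighting are assumptions. The abstract does not specify the exact allocation rule.

```python
def allocate_forces(total_force, intensities):
    """Split a measured total force across contacts in proportion to their intensities.

    `intensities` holds one non-negative value per segmented contact area,
    e.g. a summed conductivity change (an assumed choice of feature).
    """
    if any(w < 0.0 for w in intensities):
        raise ValueError("intensities must be non-negative")
    total_intensity = sum(intensities)
    if total_intensity == 0.0:
        # No contact signal: nothing to allocate.
        return [0.0 for _ in intensities]
    return [total_force * w / total_intensity for w in intensities]


# Hypothetical two-contact case: 10 N total, second contact three times stronger.
print(allocate_forces(10.0, [1.0, 3.0]))
```

By construction the per-contact estimates sum to the pressure-channel total, so errors in the EIT intensities shift force between contacts without changing the overall magnitude.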