Abstract
Inverted landing is a routine behavior among a number of animal fliers. However, mastering this feat poses a considerable challenge for robotic fliers, especially when it requires dynamic perching with rapid body rotations (or flips) and landing against gravity. Studies of inverted landing in flies suggest that optical-flow sensing is closely linked to the precise triggering and control of the body flips that lead to a variety of successful landing behaviors. Building upon this knowledge, we aimed to replicate the flies' landing behaviors in small quadcopters by developing a control policy that generalizes to arbitrary ceiling-approach conditions. First, we employed reinforcement learning in simulation to optimize discrete sensory-motor pairs across a broad spectrum of ceiling-approach velocities and directions. Next, we converted the sensory-motor pairs into a two-stage control policy in a continuous augmented optical-flow space. The control policy consists of a first-stage Flip-Trigger Policy, which employs a one-class support vector machine, and a second-stage Flip-Action Policy, implemented as a feed-forward neural network. To transfer the inverted-landing policy to physical systems, we used domain randomization and system identification for zero-shot sim-to-real transfer. As a result, we achieved a range of robust inverted-landing behaviors in small quadcopters, emulating those observed in flies.
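The two-stage structure described above (a trigger that decides when to flip, followed by an action model that decides how to flip) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-element feature vector, the uniform-weight RBF score that stands in for a trained one-class SVM decision function, the hand-set network weights, and all function names.

```python
import math
from typing import Callable, Optional, Sequence

Features = Sequence[float]  # hypothetical augmented optical-flow feature vector


def make_rbf_trigger(support: Sequence[Features], gamma: float,
                     threshold: float) -> Callable[[Features], bool]:
    """Simplified stand-in for a one-class SVM decision rule (not a trained SVM).

    Fires when the summed RBF similarity between the current state and a set
    of stored "good trigger" states reaches a threshold.
    """
    def trigger(x: Features) -> bool:
        score = sum(
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, s)))
            for s in support
        )
        return score >= threshold
    return trigger


def make_mlp(w1, b1, w2, b2) -> Callable[[Features], float]:
    """Tiny feed-forward network: one tanh hidden layer, one linear output."""
    def forward(x: Features) -> float:
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)
        ]
        return sum(w * h for w, h in zip(w2, hidden)) + b2
    return forward


def two_stage_policy(x: Features,
                     trigger: Callable[[Features], bool],
                     action: Callable[[Features], float]) -> Optional[float]:
    """Return a flip command if the trigger fires, otherwise None (keep flying)."""
    return action(x) if trigger(x) else None


# Hypothetical parameters, chosen only to make the sketch runnable.
trigger = make_rbf_trigger(
    support=[(0.20, 1.0, 0.30), (0.25, 1.2, 0.35)], gamma=10.0, threshold=1.0
)
action = make_mlp(
    w1=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], b1=[0.0, 0.0], w2=[0.5, -0.5], b2=0.0
)

near_state = (0.22, 1.1, 0.32)  # close to the stored trigger states
far_state = (2.0, 0.0, 1.5)     # far from them, so no flip is commanded
print(two_stage_policy(near_state, trigger, action))
print(two_stage_policy(far_state, trigger, action))
```

Keeping the trigger and the action as separate callables mirrors the split between the Flip-Trigger and Flip-Action policies: the first only answers whether the current state lies inside a learned region, while the second maps that state to a continuous command.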
Abstract (translated)
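Inverted landing is an everyday behavior for many flying animals. For robotic fliers, however, mastering this skill remains a considerable challenge, particularly when the landing must be dynamic, involve rapid body rotations (or flips), and be made against gravity. Inverted landings observed in flies indicate that optical-flow sensing is closely tied to the precise triggering and control of the body flips that produce a variety of successful landing behaviors. On this basis, we set out to reproduce the flies' landing behaviors by developing a general control policy that applies across a wide range of ceiling-approach velocities and directions. First, we used reinforcement learning in simulation to optimize discrete sensory-motor pairs over a broad range of approach velocities and directions. Next, we converted the sensory-motor pairs into a two-stage control policy in a continuous augmented optical-flow space. The control policy comprises a first-stage Flip-Trigger Policy, which uses a one-class support vector machine, and a second-stage Flip-Action Policy, implemented as a feed-forward neural network. To transfer the inverted-landing policy to physical systems, we used domain randomization and system identification to achieve zero-shot sim-to-real transfer. As a result, we achieved a variety of robust inverted-landing behaviors on small quadcopters, emulating those observed in flies.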
URL
https://arxiv.org/abs/2403.00128