The Unmanned Aerial Vehicle (UAV) market has been growing significantly. Given the availability of low-cost drones, the risk of their misuse for illegal purposes such as drug trafficking, spying, and terrorist attacks, which pose high risks to national security, is rising. Therefore, detecting and tracking unauthorized drones to prevent future attacks that threaten lives, facilities, and security has become a necessity. Drone detection can be performed using different sensors; image-based detection is one such approach, made practical by the development of artificial intelligence techniques. However, identifying the type of an unauthorized drone remains a challenge due to the lack of drone-type datasets. In this paper, we therefore provide a dataset of various drones as well as a comparison of established object detection models on the proposed dataset, including YOLO versions v3, v4, and v5 along with Detectron2. The experimental results of the different models are provided along with a description of each method. The collected dataset can be found in this https URL
https://arxiv.org/abs/2405.10398
Creating aerial forest-traversing robots capable of monitoring both biological and abiotic data requires a range of capabilities, including multi-functionality, robustness, and adaptability. These robots have to weather turbulent winds and various obstacles such as forest flora and wildlife, which amplifies the complexity of operating in such uncertain environments. The key to successful data collection is the flexibility to move intermittently from tree to tree in order to perch at vantage locations for extended periods. Perching not only reduces the disturbance caused by multi-rotor systems during data collection, but also allows the system to rest and recharge for longer outdoor missions. Current systems rely on add-on perching modules that increase the aerial robot's weight and reduce its overall endurance. Thus, in our work, the key questions currently studied are: "How do we develop a single robot capable of metamorphosing its body for multi-modal flight and dynamic perching?", "How do we detect and land on perchable objects robustly and dynamically?", and "What spatial-temporal data is important for us to collect?"
https://arxiv.org/abs/2405.10043
Monaural speech enhancement on drones is challenging because the ego-noise from the rotating motors and propellers leads to extremely low signal-to-noise ratios at onboard microphones. Although recent masking-based deep neural network methods excel in monaural speech enhancement, they struggle in the challenging drone noise scenario. Furthermore, existing drone noise datasets are limited, causing models to overfit. Considering the harmonic nature of drone noise, this paper proposes a frequency domain bottleneck adapter to enable transfer learning. Specifically, the adapter's parameters are trained on drone noise while the parameters of the pre-trained Frequency Recurrent Convolutional Recurrent Network (FRCRN) are kept fixed. Evaluation results demonstrate that the proposed method can effectively enhance speech quality. Moreover, it is a more efficient alternative to fine-tuning models for various drone types, which typically requires substantial computational resources.
https://arxiv.org/abs/2405.10022
Quadrotors are widely employed across various domains, yet the conventional type faces limitations due to underactuation, where attitude control is closely tied to positional adjustments. In contrast, quadrotors equipped with tiltable rotors offer overactuation, empowering them to track both position and attitude trajectories. However, the nonlinear dynamics of the drone body and the sluggish response of tilting servos pose challenges for conventional cascade controllers. In this study, we propose a control methodology for tilting-rotor quadrotors based on nonlinear model predictive control (NMPC). Unlike conventional approaches, our method preserves the full dynamics without simplification and utilizes actuator commands directly as control inputs. Notably, we incorporate a first-order servo model within the NMPC framework. Through simulation, we observe that integrating the servo dynamics not only enhances control performance but also accelerates convergence. To assess the efficacy of our approach, we fabricate a tiltable-quadrotor and deploy the algorithm onboard at a frequency of 100Hz. Extensive real-world experiments demonstrate rapid, robust, and smooth pose tracking performance.
https://arxiv.org/abs/2405.09871
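The first-order servo model mentioned in the tilting-rotor NMPC abstract above can be illustrated with a small sketch. This is not the authors' implementation: it only assumes the usual first-order lag, tau * d(angle)/dt = command - angle, and discretizes it at the 100 Hz rate quoted in the abstract. The time constant `tau = 0.1` s and the names `servo_step` and `simulate` are hypothetical choices for illustration.

```python
import math


def servo_step(angle, command, tau, dt):
    """Advance a first-order lag by one step of length dt.

    Exact discretization of tau * d(angle)/dt = command - angle,
    assuming the command is held constant during the step.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    return angle + alpha * (command - angle)


def simulate(command, tau=0.1, dt=0.01, steps=100, angle0=0.0):
    """Return the tilt-angle history for a constant command.

    dt = 0.01 s matches a 100 Hz loop; tau is an assumed time constant.
    """
    angle = angle0
    history = [angle]
    for _ in range(steps):
        angle = servo_step(angle, command, tau, dt)
        history.append(angle)
    return history


if __name__ == "__main__":
    history = simulate(command=1.0)
    # After one time constant (10 steps) the angle has covered about 63%
    # of the distance to the commanded value.
    print(round(history[10], 3))
```

A predictive controller that ignores this lag effectively assumes the tilt angle equals the command at every step, which is one way to read the abstract's observation that including the servo dynamics improves performance.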
This study examines the role of visual highlights in guiding user attention in drone monitoring tasks, employing a simulated interface for observation. The experiment results show that such highlights can significantly expedite the visual attention on the corresponding area. Based on this observation, we leverage both the temporal and spatial information in the highlight to develop a new saliency model: the highlight-informed saliency model (HISM), to infer the visual attention change in the highlight condition. Our findings show the effectiveness of visual highlights in enhancing user attention and demonstrate the potential of incorporating these cues into saliency prediction models.
https://arxiv.org/abs/2405.09695
Aerial imagery is increasingly used in Earth science and natural resource management as a complement to labor-intensive ground-based surveys. Aerial systems can collect overlapping images that provide multiple views of each location from different perspectives. However, most prediction approaches (e.g. for tree species classification) use a single, synthesized top-down "orthomosaic" image as input that contains little to no information about the vertical aspects of objects and may include processing artifacts. We propose an alternate approach that generates predictions directly on the raw images and accurately maps these predictions into geospatial coordinates using semantic meshes. This method, released as a user-friendly open-source toolkit, enables analysts to use the highest quality data for predictions, capture information about the sides of objects, and leverage multiple viewpoints of each location for added robustness. We demonstrate the value of this approach on a new benchmark dataset of four forest sites in the western U.S. that consists of drone images, photogrammetry results, predicted tree locations, and species classification data derived from manual surveys. We show that our proposed multiview method improves classification accuracy from 53% to 75% relative to an orthomosaic baseline on a challenging cross-site tree species classification task.
https://arxiv.org/abs/2405.09544
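One way to picture how multiple viewpoints of the same location could add robustness, as the multiview abstract above describes, is a per-tree vote over the labels predicted in each raw image. The abstract does not say how the toolkit aggregates views, so the majority vote below is purely an assumption, and `aggregate_views` and `classify_trees` are hypothetical names rather than functions from the released toolkit.

```python
from collections import Counter


def aggregate_views(per_view_labels):
    """Return the most frequent label among the views of one tree.

    Returns None when no view covers the tree.
    """
    if not per_view_labels:
        return None
    return Counter(per_view_labels).most_common(1)[0][0]


def classify_trees(predictions):
    """Group (tree_id, label) pairs by tree and vote within each group.

    Each pair is assumed to come from one raw image in which the tree
    was visible, already mapped to a tree identifier.
    """
    by_tree = {}
    for tree_id, label in predictions:
        by_tree.setdefault(tree_id, []).append(label)
    return {tree_id: aggregate_views(labels) for tree_id, labels in by_tree.items()}


if __name__ == "__main__":
    views = [("t1", "pine"), ("t1", "fir"), ("t1", "pine"), ("t2", "fir")]
    print(classify_trees(views))
```

A single misclassified view of tree `t1` is outvoted by the two other views, which is the robustness argument in miniature.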
In this paper, we present an innovative technique for path planning of flying robots in a 3D environment, formulated in terms of rough mereology. The main goal was to construct an algorithm that generates mereological potential fields in three-dimensional space. To avoid falling into local minima, we use a weighted Euclidean distance. Moreover, we applied a search for a path from the start point to the target that avoids obstacles. The environment was created by connecting two cameras working in real time. The Python library OpenCV [1], which recognized shapes and colors, was responsible for determining the gate and the elements of the world inside the map. The main purpose of this paper is to apply the given results to drones.
https://arxiv.org/abs/2405.09282
Object detection techniques for Unmanned Aerial Vehicles (UAVs) rely on Deep Neural Networks (DNNs), which are vulnerable to adversarial attacks. Nonetheless, adversarial patches generated by existing algorithms in the UAV domain pay very little attention to the naturalness of adversarial patches. Moreover, imposing constraints directly on adversarial patches makes it difficult to generate patches that appear natural to the human eye while ensuring a high attack success rate. We notice that patches are natural looking when their overall color is consistent with the environment. Therefore, we propose a new method named Environmental Matching Attack (EMA) to address the issue of optimizing the adversarial patch under the constraints of color. To the best of our knowledge, this paper is the first to consider natural patches in the domain of UAVs. The EMA method exploits strong prior knowledge of a pretrained stable diffusion to guide the optimization direction of the adversarial patch, where the text guidance can restrict the color of the patch. To better match the environment, the contrast and brightness of the patch are appropriately adjusted. Instead of optimizing the adversarial patch itself, we optimize an adversarial perturbation patch which initializes to zero so that the model can better trade off attacking performance and naturalness. Experiments conducted on the DroneVehicle and Carpk datasets have shown that our work can reach nearly the same attack performance in the digital attack (no greater than 2 in mAP%), surpass the baseline method in the physical specific scenarios, and exhibit a significant advantage in terms of naturalness in visualization and color difference with the environment.
https://arxiv.org/abs/2405.07595
Berry picking has long-standing traditions in Finland, yet it is challenging and can potentially be dangerous. The integration of drones equipped with advanced imaging techniques represents a transformative leap forward, optimising harvests and promising sustainable practices. We propose WildBe, the first image dataset of wild berries captured in peatlands and under the canopy of Finnish forests using drones. Unlike previous and related datasets, WildBe includes new varieties of berries, such as bilberries, cloudberries, lingonberries, and crowberries, captured under severe light variations and in cluttered environments. WildBe features 3,516 images, including a total of 18,468 annotated bounding boxes. We carry out a comprehensive analysis of WildBe using six popular object detectors, assessing their effectiveness in berry detection across different forest regions and camera types. We will release WildBe publicly.
https://arxiv.org/abs/2405.07550
Consumer-grade drones have become effective multimedia collection tools, spring-boarded by rapid development in embedded CPUs, GPUs, and cameras. They are best known for their ability to cheaply collect high-quality aerial video, 3D terrain scans, infrared imagery, etc., with respect to manned aircraft. However, users can also create and attach custom sensors, actuators, or computers, so the drone can collect different data, generate composite data, or interact intelligently with its environment, e.g., autonomously changing behavior to land in a safe way, or choosing further data collection sites. Unfortunately, developing custom payloads is prohibitively difficult for many researchers outside of engineering. We provide guidelines for how to create a sophisticated computational payload that integrates a Raspberry Pi 5 into a DJI Matrice 350. The payload fits into the Matrice's case like a typical DJI payload (but is much cheaper), is easy to build and expand (3D-printed), uses the drone's power and telemetry, can control the drone and its other payloads, can access the drone's sensors and camera feeds, and can process video and stream it to the operator via the controller in real time. We describe the difficulties and proprietary quirks we encountered, how we worked through them, and provide setup scripts and a known-working configuration for others to use.
https://arxiv.org/abs/2405.06176
This article presents the world's first rapid drone flocking control using natural language through generative AI. The described approach enables the intuitive orchestration of a flock of any size to achieve the desired geometry. The key feature of the method is the development of a new interface based on Large Language Models to communicate with the user and to generate the target geometry descriptions. Users can interactively modify or provide comments during the construction of the flock geometry model. By combining flocking technology and defining the target surface using a signed distance function, smooth and adaptive movement of the drone swarm between target states is achieved. Our user study on FlockGPT confirmed a high level of intuitive control over drone flocking by users. Subjects who had never previously controlled a swarm of drones were able to construct complex figures in just a few iterations and were able to accurately distinguish the formed swarm drone figures. The results revealed a high recognition rate for six different geometric patterns generated through the LLM-based interface and performed by a simulated drone flock (mean of 80% with a maximum of 93% for cube and tetrahedron patterns). Users commented on low temporal demand (19.2 score in NASA-TLX), high performance (26 score in NASA-TLX), attractiveness (1.94 UEQ score), and hedonic quality (1.81 UEQ score) of the developed system. The FlockGPT demo code repository can be found at: coming soon
https://arxiv.org/abs/2405.05872
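The FlockGPT abstract above defines the target surface with a signed distance function (SDF). As a rough illustration of that idea only, the sketch below uses the SDF of a sphere (negative inside, zero on the surface, positive outside) and moves a single point toward the surface along the SDF gradient. The update rule, the `gain` parameter, and the names `sphere_sdf` and `step_toward_surface` are assumptions made for this example; the abstract does not describe how the flock actually follows the SDF.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance from point to a sphere surface.

    Negative inside the sphere, zero on it, positive outside.
    """
    return math.dist(point, center) - radius


def step_toward_surface(point, center, radius, gain=0.5):
    """Move point a fraction `gain` of its signed distance toward the surface.

    The SDF gradient of a sphere is the unit vector pointing away from
    the center; at the center itself an arbitrary direction is used.
    """
    dist_to_center = math.dist(point, center)
    if dist_to_center == 0.0:
        direction = (1.0, 0.0, 0.0)
    else:
        direction = tuple((p - c) / dist_to_center for p, c in zip(point, center))
    signed = dist_to_center - radius
    return tuple(p - gain * signed * g for p, g in zip(point, direction))


if __name__ == "__main__":
    center, radius = (0.0, 0.0, 0.0), 1.0
    drone = (3.0, 0.0, 0.0)
    for _ in range(20):
        drone = step_toward_surface(drone, center, radius)
    # The remaining signed distance shrinks by the factor (1 - gain) per step.
    print(abs(sphere_sdf(drone, center, radius)) < 1e-5)
```

Because the same rule works from inside and outside the shape, swapping in the SDF of a different target shape changes the destination without changing the motion logic.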
This work presents a drone detector with a modified backbone and a multiple pyramid feature maps enhancement structure (MDDPE). Novel feature map improvement modules that use different levels of information to produce more robust and discriminative features are proposed. These modules include the feature maps supplement function and the feature maps recombination enhancement function. To effectively handle the drone characteristics, auxiliary supervisions, implemented in the early stages by employing tailored anchors, are utilized. To further improve the modeling of real drone detection scenarios and the initialization of the regressor, an updated anchor matching technique is introduced to match anchors and ground truth drones as closely as feasible. To show the proposed MDDPE's superiority over the most advanced detectors, extensive experiments are carried out using well-known drone detection benchmarks.
https://arxiv.org/abs/2405.02882
Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty. This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy. Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness. Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and extended temporal spans. This performance underscores the model's unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections.
https://arxiv.org/abs/2405.02145
Rapid advancements of deep learning are accelerating adoption in a wide variety of applications, including safety-critical applications such as self-driving vehicles, drones, robots, and surveillance systems. These advancements include applying variations of sophisticated techniques that improve the performance of models. However, such models are not immune to adversarial manipulations, which can cause the system to misbehave and remain unnoticed by experts. The frequency of modifications to existing deep learning models necessitates thorough analysis to determine the impact on models' robustness. In this work, we present an experimental evaluation of the effects of model modifications on deep learning model robustness using adversarial attacks. Our methodology involves examining the robustness of variations of models against various adversarial attacks. By conducting our experiments, we aim to shed light on the critical issue of maintaining the reliability and safety of deep learning models in safety- and security-critical applications. Our results indicate the pressing demand for an in-depth assessment of the effects of model changes on the robustness of models.
https://arxiv.org/abs/2405.01934
Invasive plant species are detrimental to the ecology of both agricultural and wildland areas. Euphorbia esula, or leafy spurge, is one such plant that has spread through much of North America from Eastern Europe. When paired with contemporary computer vision systems, unmanned aerial vehicles, or drones, offer the means to track expansion of problem plants, such as leafy spurge, and improve chances of controlling these weeds. We gathered a dataset of leafy spurge presence and absence in grasslands of western Montana, USA, then surveyed these areas with a commercial drone. We trained image classifiers on these data, and our best performing model, a pre-trained DINOv2 vision transformer, identified leafy spurge with 0.84 accuracy (test set). This result indicates that classification of leafy spurge is tractable, but not solved. We release this unique dataset of labelled and unlabelled, aerial drone imagery for the machine learning community to explore. Improving classification performance of leafy spurge would benefit the fields of ecology, conservation, and remote sensing alike. Code and data are available at our website: this http URL.
https://arxiv.org/abs/2405.03702
In autonomous and mobile robotics, a principal challenge is resilient real-time environmental perception, particularly in situations characterized by unknown and dynamic elements, as exemplified in the context of autonomous drone racing. This study introduces a perception technique for detecting drone racing gates under illumination variations, which is common during high-speed drone flights. The proposed technique relies upon a lightweight neural network backbone augmented with capabilities for continual learning. The envisaged approach amalgamates predictions of the gates' positional coordinates, distance, and orientation, encapsulating them into a cohesive pose tuple. A comprehensive number of tests serve to underscore the efficacy of this approach in confronting diverse and challenging scenarios, specifically those involving variable lighting conditions. The proposed methodology exhibits notable robustness in the face of illumination variations, thereby substantiating its effectiveness.
https://arxiv.org/abs/2405.01054
The availability of high-quality datasets is crucial for the development of behavior prediction algorithms in autonomous vehicles. This paper highlights the need for standardizing the use of certain datasets for motion forecasting research to simplify comparative analysis and proposes a set of tools and practices to achieve this. Drawing on extensive experience and a comprehensive review of current literature, we summarize our proposals for preprocessing, visualizing, and evaluation in the form of an open-sourced toolbox designed for researchers working on trajectory prediction problems. The clear specification of necessary preprocessing steps and evaluation metrics is intended to alleviate development efforts and facilitate the comparison of results across different studies. The toolbox is available at: this https URL.
https://arxiv.org/abs/2405.00604
A vision-based drone-to-drone detection system is crucial for various applications like collision avoidance, countering hostile drones, and search-and-rescue operations. However, detecting drones presents unique challenges, including small object sizes, distortion, occlusion, and real-time processing requirements. Current methods integrating multi-scale feature fusion and temporal information have limitations in handling extreme blur and minuscule objects. To address this, we propose a novel coarse-to-fine detection strategy based on vision transformers. We evaluate our approach on three challenging drone-to-drone detection datasets, achieving F1 score enhancements of 7%, 3%, and 1% on the FL-Drones, AOT, and NPS-Drones datasets, respectively. Additionally, we demonstrate real-time processing capabilities by deploying our model on an edge-computing device. Our code will be made publicly available.
https://arxiv.org/abs/2404.19276
The multi-drone cooperative transport (CT) problem has been widely studied in the literature. However, limited work exists on the control of such systems in the presence of time-varying uncertainties, such as a time-varying center of gravity (CG). This paper presents a leader-follower approach for the control of a multi-drone CT system with time-varying CG. The leader uses a traditional Proportional-Integral-Derivative (PID) controller, whereas the follower uses a deep reinforcement learning (RL) controller that relies only on local information and minimal leader information. Extensive simulation results are presented, showing the effectiveness of the proposed method over a previously developed adaptive controller and for variations in the mass of the objects being transported and in CG speeds. Preliminary experimental work also demonstrates ball balance (depicting moving CG) on a stick/rod lifted by two Crazyflie drones cooperatively.
https://arxiv.org/abs/2404.19070
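For readers unfamiliar with the leader's controller in the cooperative-transport abstract above, here is a minimal textbook discrete PID sketch. It is not the authors' controller: the gains, the time step, and the class name `PID` are illustrative assumptions, and the derivative term is simply set to zero on the first update because no previous error exists yet.

```python
class PID:
    """Textbook discrete PID controller with a fixed time step."""

    def __init__(self, kp, ki, kd, dt):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.dt = dt
        self.integral = 0.0
        self.prev_error = None  # no error history before the first update

    def update(self, setpoint, measurement):
        """Return the control output for the current measurement."""
        error = setpoint - measurement
        # Accumulate the integral term with a rectangular rule.
        self.integral += error * self.dt
        # Backward-difference derivative; zero on the very first call.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    # Illustrative gains only; they are not taken from the paper.
    altitude_pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.1)
    print(altitude_pid.update(setpoint=1.0, measurement=0.0))
    print(altitude_pid.update(setpoint=1.0, measurement=0.5))
```

A fixed-gain controller like this has no mechanism for tracking a shifting center of gravity, which is consistent with the abstract pairing it with a learned controller on the follower.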
We propose a novel failure-aware reactive UAV delivery service composition framework. A skyway network infrastructure is presented for the effective provisioning of services in urban areas. We present a formal drone delivery service model and a system architecture for reactive drone delivery services. We develop radius-based, cell density-based, and two-phased algorithms to reduce the search space and perform reactive service compositions when a service failure occurs. We conduct a set of experiments with a real drone dataset to demonstrate the effectiveness of our proposed approach.
https://arxiv.org/abs/2404.18363
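The radius-based search-space reduction named in the delivery-service abstract above can be pictured as keeping only the skyway nodes close to the point where a service failed, so that recomposition considers fewer candidates. The abstract gives no algorithmic detail, so the sketch below is an assumption throughout: nodes are plain 2D coordinates, distance is straight-line, and `within_radius` is a hypothetical helper name.

```python
import math


def within_radius(nodes, center, radius):
    """Return ids of nodes whose straight-line distance to center is <= radius.

    `nodes` maps a node id to its (x, y) position; ids are returned in
    the mapping's iteration order.
    """
    return [
        node_id
        for node_id, position in nodes.items()
        if math.dist(position, center) <= radius
    ]


if __name__ == "__main__":
    # Hypothetical skyway nodes, coordinates in arbitrary distance units.
    skyway_nodes = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (10.0, 0.0)}
    failure_location = (0.0, 0.0)
    print(within_radius(skyway_nodes, failure_location, radius=5.0))
```

Only the nodes returned here would be passed on to the composition step; the radius trades a smaller search space against the risk of excluding a usable node.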