Autonomous

Adaptive Local Binary Pattern: A Novel Feature Descriptor for Enhanced Analysis of Kidney Abnormalities in CT Scan Images using ensemble based Machine Learning Approach

2024-04-22 20:15:43

Tahmim Hossain, Faisal Sayed, Solehin Islam

arXiv_CV

arXiv_CV Pose Autonomous Action
Abstract

The shortage of nephrologists and the growing public health concern over renal failure have spurred demand for AI systems capable of autonomously detecting kidney abnormalities. Renal failure, marked by a gradual decline in kidney function, can result from factors such as cysts, stones, and tumors. Chronic kidney disease may go unnoticed initially, so cases remain untreated until they reach an advanced stage. The dataset, comprising 12,427 images from multiple hospitals in Dhaka, was categorized into four groups: cyst, tumor, stone, and normal. Our methodology enhances CT scan image quality using cropping, resizing, and CLAHE (contrast-limited adaptive histogram equalization), followed by feature extraction with our proposed Adaptive Local Binary Pattern (A-LBP) method, which we compare with the state-of-the-art local binary pattern (LBP) method. The proposed features are fed into classifiers such as Random Forest, Decision Tree, Naive Bayes, K-Nearest Neighbor, and SVM. We also explored an ensemble model with soft voting to obtain a more robust model for the task. Our best result, an accuracy above 99%, came from combining our feature descriptor with a soft-voting ensemble of the five classifiers (Random Forest, Decision Tree, Naive Bayes, K-Nearest Neighbor, and Support Vector Machine).
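To make the two building blocks named in this abstract concrete, here is a minimal sketch in plain Python. It shows the classic 8-neighbour LBP code (the baseline, not the authors' A-LBP, whose adaptive rule the abstract does not spell out) and soft voting over class probabilities. The function names, the neighbour ordering, and the example probabilities are illustrative assumptions, not taken from the paper.

```python
# Clockwise neighbour offsets (row, col) around the centre of a 3x3 window.
# The ordering is a convention chosen for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Classic LBP code of pixel (r, c); the pixel must not lie on the border."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels (the feature vector)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


def soft_vote(probabilities, classes):
    """Average the per-classifier class probabilities and return the top class."""
    n = len(probabilities)
    mean = [sum(p[k] for p in probabilities) / n for k in range(len(classes))]
    best = max(range(len(classes)), key=lambda k: mean[k])
    return classes[best], mean


if __name__ == "__main__":
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 6, 3]]
    print(lbp_code(patch, 1, 1))

    classes = ["cyst", "normal", "stone", "tumor"]
    # Hypothetical outputs of three classifiers for one image.
    probs = [[0.90, 0.10, 0.0, 0.0],
             [0.40, 0.60, 0.0, 0.0],
             [0.45, 0.55, 0.0, 0.0]]
    print(soft_vote(probs, classes)[0])
```

In the toy probabilities, two of the three classifiers individually prefer "normal", yet the averaged probabilities favour "cyst"; this is the behaviour that distinguishes soft voting from majority (hard) voting.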

Abstract (translated)

肾衰竭（肾衰竭）引起的肾小管短缺和公共卫生担忧的增加，促使人们对能够自主检测肾脏异常的AI系统产生需求。肾衰竭可能由囊肿、结石和肿瘤等因素引起。慢性肾衰竭可能最初被忽视，导致直到达到晚期才得到治疗。这个数据集包括来自达卡多家医院的12,427张图像，分为四类：囊肿、肿瘤、结石和正常。我们的方法旨在通过裁剪、缩放和局部二值化（CALHE）技术提高CT扫描图像质量，然后使用我们提出的自适应局部二值化（A-LBP）特征提取方法与最先进的局部二值化（LBP）方法进行特征提取。我们将提出的特征输入类学习器，如随机森林（Random Forest）、决策树（Decision Tree）、朴素贝叶斯（Naive Bayes）、K近邻（K-Nearest Neighbor）和支持向量机（SVM）。我们探讨了软投票的集成模型，以获得我们任务的更稳健的模型。我们使用软投票方法获得了超过99%的准确率。

URL

https://arxiv.org/abs/2404.14560

PDF

https://arxiv.org/pdf/2404.14560.pdf
Closing the Perception-Action Loop for Semantically Safe Navigation in Semi-Static Environments

2024-04-22 19:36:30

Jingxing Qian, Siqi Zhou, Nicholas Jianrui Ren, Veronica Chatrath, Angela P. Schoellig

arXiv_RO

arXiv_RO Autonomous Action
Abstract

Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-loop perception-action pipeline that bridges this gap. Our system encodes an online-constructed dense map, along with object-level semantic and consistency estimates into a control barrier function (CBF) to regulate safe regions in the scene. A model predictive controller (MPC) leverages the CBF-based safety constraints to adapt its navigation behaviour, which is particularly crucial when potential scene changes occur. We test the system in simulations and real-world experiments to demonstrate the impact of semantic information and scene change handling on robot behavior, validating the practicality of our approach.
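The role of a control barrier function can be illustrated with a one-dimensional toy problem. The sketch below is not the paper's MPC formulation: it assumes single-integrator dynamics x' = u, a single static obstacle, and a one-step safety filter. All names and numbers are hypothetical.

```python
def barrier(x, x_obstacle, margin):
    """h(x) >= 0 exactly when the robot keeps at least `margin` from the obstacle."""
    return x_obstacle - x - margin


def safe_velocity(u_nominal, x, x_obstacle, margin, alpha):
    """Admissible velocity closest to u_nominal under the CBF condition.

    The condition is dh/dt + alpha * h >= 0. With x' = u we get dh/dt = -u,
    so it reduces to u <= alpha * h(x).
    """
    return min(u_nominal, alpha * barrier(x, x_obstacle, margin))


def simulate(x0, x_obstacle, margin, u_nominal, alpha, dt, steps):
    """Forward-Euler rollout of the filtered controller; returns all positions."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += safe_velocity(u_nominal, x, x_obstacle, margin, alpha) * dt
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate(x0=0.0, x_obstacle=5.0, margin=0.5,
                    u_nominal=1.0, alpha=2.0, dt=0.05, steps=400)
    print(min(barrier(x, 5.0, 0.5) for x in traj))
```

The robot drives at the nominal speed while far from the obstacle and slows smoothly as h approaches zero. In discrete time the guarantee holds here because alpha * dt <= 1; a map that changes online, as in the abstract, would amount to updating `x_obstacle` and `margin` between steps.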

Abstract (translated)

自主机器人导航在变化环境中需要适应性导航策略来实现安全长期操作。虽然许多现代控制范式提供了理论保证，但它们通常假定已知的外部安全约束，忽视了在现实环境中物体可能出现、消失和移动的事实挑战。在本文中，我们提出了一个端到端的感知-动作管道，弥合了这一空白。我们的系统编码了一个在线构建的密集地图以及物体级别的语义和一致性估计，作为一个控制障碍函数（CBF）来调节场景中的安全区域。一个模型预测控制器（MPC）利用基于CBF的安全约束来适应其导航行为，尤其是在可能发生场景变化时更是至关重要。我们在仿真和现实实验中测试了系统，以证明语义信息和场景变化处理对机器人行为的影响，验证了我们对方法的实用性。

URL

https://arxiv.org/abs/2404.14546

PDF

https://arxiv.org/pdf/2404.14546.pdf
A Survey on Self-Evolution of Large Language Models

2024-04-22 17:43:23

Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, Dacheng Tao, Jingren Zhou

arXiv_AI

arXiv_AI Survey Face Language_Model Pose Autonomous Agent
Abstract

Large language models (LLMs) have advanced significantly in various fields and intelligent agent applications. However, current LLMs that learn from human or external model supervision are costly and may face performance ceilings as task complexity and diversity increase. To address this issue, self-evolution approaches that enable LLMs to autonomously acquire, refine, and learn from experiences generated by the model itself are growing rapidly. This new training paradigm, inspired by the human experiential learning process, offers the potential to scale LLMs towards superintelligence. In this work, we present a comprehensive survey of self-evolution approaches in LLMs. We first propose a conceptual framework for self-evolution and outline the evolving process as iterative cycles composed of four phases: experience acquisition, experience refinement, updating, and evaluation. Second, we categorize the evolution objectives of LLMs and LLM-based agents; then, we summarize the literature and provide taxonomy and insights for each module. Lastly, we pinpoint existing challenges and propose future directions to improve self-evolution frameworks, equipping researchers with critical insights to fast-track the development of self-evolving LLMs.

Abstract (translated)

大语言模型（LLMs）在各种领域和智能机器人应用方面取得了显著的进步。然而，当前从人类或外部模型监督中学习的LLM成本较高，且随着任务复杂性和多样性的增加，可能面临性能上限。为解决这个问题，自进化方法使LLM能够自主获取、精炼并从模型自身生成的经验中学习，正在快速发展。这一新训练范式在很大程度上受到了人类经验学习的启发，为LLM达到超级智能提供了潜力。在这项工作中，我们全面调查了LLM的自进化方法。我们首先提出了一个自进化的概念框架，并概述了自进化的演变过程由四个阶段组成：经验获取、经验精炼、更新和评估。接下来，我们分类了LLM和基于LLM的智能代理的演化目标；然后，我们总结了文献，并为每个模块提供了分类和见解。最后，我们指出了现有挑战，并为改善自进化框架提出了未来方向，以便研究人员能够关键性地快速推进LLM的自进化发展。

URL

https://arxiv.org/abs/2404.14387

PDF

https://arxiv.org/pdf/2404.14387.pdf
PLUTO: Pushing the Limit of Imitation Learning-based Planning for Autonomous Driving

2024-04-22 16:38:41

Jie Cheng, Yingbing Chen, Qifeng Chen

arXiv_RO

arXiv_RO Autonomous Action Contrastive_Learning
Abstract

We present PLUTO, a powerful framework that pushes the limit of imitation learning-based planning for autonomous driving. Our improvements stem from three pivotal aspects: a longitudinal-lateral aware model architecture that enables flexible and diverse driving behaviors; an innovative auxiliary loss computation method that is broadly applicable and efficient for batch-wise calculation; and a novel training framework that leverages contrastive learning, augmented by a suite of new data augmentations, to regulate driving behaviors and facilitate the understanding of underlying interactions. We assessed our framework using the large-scale real-world nuPlan dataset and its associated standardized planning benchmark. Impressively, PLUTO achieves state-of-the-art closed-loop performance, beating other competing learning-based methods and, for the first time, surpassing the current top-performing rule-based planner. Results and code are available at this https URL.

Abstract (translated)

我们提出了PLUTO，一个强大的框架，可以将自动驾驶中基于模仿学习的规划极限推向更高。我们的改进源于三个关键方面：一个纵向-横向感知模型架构，实现灵活多样且和谐的驾驶行为；一种适用于批量计算的创新辅助损失计算方法；一种利用对比学习的新颖训练框架，通过一系列新的数据增强方法调节驾驶行为，并促进底层交互的理解。我们对PLUTO框架进行了评估，使用了大规模现实世界nuPlan数据集及其相关的标准化规划基准。令人印象深刻的是，PLUTO实现了最先进的闭环性能，超越了其他竞争性的基于学习的方法和当前最高表现的基于规则的规划器，这是第一次实现的。结果和代码可在此链接中查看：https://url.org/

URL

https://arxiv.org/abs/2404.14327

PDF

https://arxiv.org/pdf/2404.14327.pdf
Autonomous Forest Inventory with Legged Robots: System Design and Field Deployment

2024-04-22 13:13:14

Matías Mattamala, Nived Chebrolu, Benoit Casseau, Leonard Freißmuth, Jonas Frey, Turcan Tuna, Marco Hutter, Maurice Fallon

arXiv_RO

arXiv_RO Segmentation Survey Autonomous
Abstract

We present a solution for autonomous forest inventory with a legged robotic platform. Compared to their wheeled and aerial counterparts, legged platforms offer an attractive balance of endurance and low soil impact for forest applications. In this paper, we present the complete system architecture of our forest inventory solution, which includes state estimation, navigation, mission planning, and real-time tree segmentation and trait estimation. We present preliminary results for three campaigns in forests in Finland and the UK and summarize the main outcomes, lessons, and challenges. Our UK experiment at the Forest of Dean with the ANYmal D legged platform achieved an autonomous survey of a 0.96 hectare plot in 20 min, identifying over 100 trees with a typical diameter at breast height (DBH) accuracy of 2 cm.

Abstract (translated)

我们提出了一个具有腿式机器人平台的自主森林清查解决方案。与轮式和空中平台相比，腿式平台具有持久的耐久性和对土壤的低影响，对于森林应用具有吸引力。在本文中，我们介绍了我们森林清查方案的完整系统架构，包括状态估计、导航、任务规划和实时树分割和特征估计。我们总结了三个国家（芬兰和英国）的森林中的初步结果，包括主要结论、经验教训和挑战。在德文森林中使用ANYmal D腿式平台进行的实验，在20分钟内实现了对0.96公顷铺面的自主调查，识别了超过100棵具有典型DBH精度的树木。

URL

https://arxiv.org/abs/2404.14157

PDF

https://arxiv.org/pdf/2404.14157.pdf
Human Orientation Estimation under Partial Observation

2024-04-22 12:45:04

Jieting Zhao, Hanjing Ye, Yu Zhan, Hong Zhang

arXiv_RO

arXiv_RO Prediction Autonomous Action Agent
Abstract

Reliable human orientation estimation (HOE) is critical for autonomous agents to understand human intention and perform human-robot interaction (HRI) tasks. Great progress has been made in HOE under full observation. However, existing methods easily make wrong predictions under partial observation and assign them an unexpectedly high probability. To solve these problems, this study first develops a method that estimates orientation from the visible joints of a target person so that it can handle partial observation. Subsequently, we introduce a confidence-aware orientation estimation method, enabling more accurate orientation estimation and reasonable confidence estimation under partial observation. The effectiveness of our method is validated on both public and custom-built datasets, where it shows substantial improvements in accuracy and reliability in partial observation scenarios. In particular, we show in real experiments that our method can benefit the robustness and consistency of the robot person following (RPF) task.

Abstract (translated)

可靠的人体方向估计（HOE）对于自主机器人来说理解人类意图并执行人机交互（HRI）任务至关重要。在完全观察的情况下，HOE取得了很大的进展。然而，现有的方法在部分观察时很容易做出错误的预测，并且给出了意外高的概率。为了解决上述问题，本研究首先开发了一种从目标人物的可视关节估计方向的方法，使其能够处理部分观察。接着，我们引入了一种基于信心的方向估计方法，使得在部分观察的情况下进行更准确的方向估计和合理的自信估计。我们方法的成效在公开和定制数据集上都被验证，并在部分观察场景中取得了很大精度和可靠性提升。特别地，在实际实验中，我们证明了我们的方法可以提高机器人人们在（RPF）任务中的鲁棒性和一致性。

URL

https://arxiv.org/abs/2404.14139

PDF

https://arxiv.org/pdf/2404.14139.pdf
Immersive Rover Control and Obstacle Detection based on Extended Reality and Artificial Intelligence

2024-04-22 11:28:34

Sofía Coloma, Alexandre Frantz, Dave van der Meer, Ernest Skrzypczyk, Andrej Orsula, Miguel Olivares-Mendez

arXiv_RO

arXiv_RO Detection Face Pose Autonomous 3D
Abstract

Lunar exploration has become a key focus, driving scientific and technological advances. Ongoing missions are deploying rovers to the surface of the Moon, targeting the far side and south pole. However, these terrains pose challenges, emphasizing the need for precise obstacle and resource detection to avoid mission risks. This work proposes a novel system that integrates eXtended Reality (XR) and Artificial Intelligence (AI) to teleoperate lunar rovers. It is capable of autonomously detecting rocks and recreating an immersive 3D virtual environment of the location of the robot. This system has been validated in a lunar laboratory to observe its advantages over traditional 2D-based teleoperation approaches.

Abstract (translated)

登月探险已成为一个关键的重点，推动了科学和技术的进步。正在进行的研究任务正在向月球表面部署漫游车，针对远端和南极。然而，这些地形带来了挑战，强调了在避免任务风险方面需要精确的障碍和资源检测的重要性。这项工作提出了一种集成增强现实（XR）和人工智能（AI）的新型系统，用于遥控月球漫游车。它能够自主检测岩石并创建机器人所在位置的沉浸式3D虚拟环境。已经在月球实验室验证了该系统的优势，与传统的2D基于遥控方法相比。

URL

https://arxiv.org/abs/2404.14095

PDF

https://arxiv.org/pdf/2404.14095.pdf
CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction

2024-04-22 09:55:50

Wenhao Lan, Yijun Yang, Haihua Shen, Shan Li

arXiv_CV

arXiv_CV Recognition Classification Prediction Pose Autonomous 3D Point_Cloud
Abstract

The increasing adoption of 3D point cloud data in various applications, such as autonomous vehicles, robotics, and virtual reality, has brought about significant advancements in object recognition and scene understanding. However, this progress is accompanied by new security challenges, particularly in the form of backdoor attacks. These attacks involve inserting malicious information into the training data of machine learning models, potentially compromising the model's behavior. In this paper, we propose CloudFort, a novel defense mechanism designed to enhance the robustness of 3D point cloud classifiers against backdoor attacks. CloudFort leverages spatial partitioning and ensemble prediction techniques to effectively mitigate the impact of backdoor triggers while preserving the model's performance on clean data. We evaluate the effectiveness of CloudFort through extensive experiments, demonstrating its strong resilience against the Point Cloud Backdoor Attack (PCBA). Our results show that CloudFort significantly enhances the security of 3D point cloud classification models without compromising their accuracy on benign samples. Furthermore, we explore the limitations of CloudFort and discuss potential avenues for future research in the field of 3D point cloud security. The proposed defense mechanism represents a significant step towards ensuring the trustworthiness and reliability of point-cloud-based systems in real-world applications.

Abstract (translated)

3D点云数据的日益广泛应用，如自动驾驶、机器人学和虚拟现实，带来了物体识别和场景理解方面的显著进步。然而，这一进步伴随着新的安全挑战，特别是后门攻击。这些攻击涉及在机器学习模型的训练数据中插入恶意信息，可能危及模型的行为。在本文中，我们提出了CloudFort，一种专门设计用于增强3D点云分类器对后门攻击的鲁棒性的新颖防御机制。CloudFort利用空间分割和集成预测技术，有效减轻后门触发器对模型的影响，同时保留模型在干净数据上的性能。我们通过广泛的实验评估了CloudFort的有效性，证明了它对点云后门攻击（PCBA）具有很强的抵抗力。我们的结果表明，CloudFort显著增强了不牺牲准确性的3D点云分类模型的安全性。此外，我们探讨了CloudFort的局限性，并讨论了该领域未来研究的潜在方向。所提出的防御机制在确保基于点云的系统的可靠性和可信度方面迈出了重要的一步。

URL

https://arxiv.org/abs/2404.14042

PDF

https://arxiv.org/pdf/2404.14042.pdf
PointDifformer: Robust Point Cloud Registration With Neural Diffusion and Transformer

2024-04-22 09:50:12

Rui She, Qiyu Kang, Sijie Wang, Wee Peng Tay, Kai Zhao, Yang Song, Tianyu Geng, Yi Xu, Diego Navarro Navarro, Andreas Hartmannsgruber

arXiv_CV

arXiv_CV Attention Transformer Pose Autonomous Point_Cloud Diffusion
Abstract

Point cloud registration is a fundamental technique in 3-D computer vision with applications in graphics, autonomous driving, and robotics. However, registration tasks under challenging conditions, under which noise or perturbations are prevalent, can be difficult. We propose a robust point cloud registration approach that leverages graph neural partial differential equations (PDEs) and heat kernel signatures. Our method first uses graph neural PDE modules to extract high dimensional features from point clouds by aggregating information from the 3-D point neighborhood, thereby enhancing the robustness of the feature representations. Then, we incorporate heat kernel signatures into an attention mechanism to efficiently obtain corresponding keypoints. Finally, a singular value decomposition (SVD) module with learnable weights is used to predict the transformation between two point clouds. Empirical experiments on a 3-D point cloud dataset demonstrate that our approach not only achieves state-of-the-art performance for point cloud registration but also exhibits better robustness to additive noise or 3-D shape perturbations.
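The final step described in this abstract, recovering a rigid transformation from matched keypoints with an SVD, can be sketched with the standard weighted Kabsch procedure. In the paper the weights are learned; in this sketch they are simply passed in, and the function name is an assumption. NumPy is required.

```python
import numpy as np


def weighted_rigid_transform(source, target, weights):
    """Return (R, t) minimising sum_i w_i * ||R @ source_i + t - target_i||^2.

    source, target: (N, 3) arrays of corresponding points.
    weights: (N,) non-negative correspondence weights.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids and centred point sets.
    src_centroid = w @ source
    tgt_centroid = w @ target
    src_centred = source - src_centroid
    tgt_centred = target - tgt_centroid

    # 3x3 weighted cross-covariance matrix and its SVD.
    H = (src_centred * w[:, None]).T @ tgt_centred
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(20, 3))
    angle = np.radians(30.0)
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    tgt = src @ R_true.T + np.array([1.0, 2.0, 3.0])
    R, t = weighted_rigid_transform(src, tgt, np.ones(20))
    print(np.allclose(R, R_true), np.allclose(t, [1.0, 2.0, 3.0]))
```

Down-weighting a correspondence reduces its influence on both the centroids and the cross-covariance, which is why learned weights can make this step more tolerant of noisy or wrong matches.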

Abstract (translated)

点云配准是3D计算机视觉中的一个基本技术，应用于图形学、自动驾驶和机器人领域。然而，在具有噪声或扰动的环境下，配准任务可能会变得困难。我们提出了一种鲁棒的点云配准方法，它利用图神经 partial differential equations (PDEs) 和热核签名。我们的方法首先使用图神经 PDE 模块从点云中提取高维特征，通过聚合来自3D点邻域的信息来增强特征表示的鲁棒性。然后，我们将热核签名纳入关注机制，以高效地获得相应的关键点。最后，使用带可学习权重的单值分解（SVD）模块预测两个点云之间的变换。在3D点云数据集的实证实验中，我们的方法不仅实现了点云配准的尖端性能，还表现出了对添加噪声或3D形状扰动的鲁棒性更好。

URL

https://arxiv.org/abs/2404.14034

PDF

https://arxiv.org/pdf/2404.14034.pdf
Collaborative Perception Datasets in Autonomous Driving: A Survey

2024-04-22 09:36:17

Melih Yazgan, Mythra Varun Akkanapragada, J. Marius Zoellner

arXiv_CV

arXiv_CV Survey Autonomous
Abstract

This survey offers a comprehensive examination of collaborative perception datasets in the context of Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Everything (V2X). It highlights the latest developments in large-scale benchmarks that accelerate advancements in perception tasks for autonomous vehicles. The paper systematically analyzes a variety of datasets, comparing them based on aspects such as diversity, sensor setup, quality, public availability, and their applicability to downstream tasks. It also highlights key challenges such as domain shift, sensor setup limitations, and gaps in dataset diversity and availability. It emphasizes the importance of addressing privacy and security concerns in data sharing and dataset creation. The conclusion underscores the necessity for comprehensive, globally accessible datasets and collaborative efforts from both technological and research communities to overcome these challenges and fully harness the potential of autonomous driving.

Abstract (translated)

这项调查对车辆与基础设施（V2I）、车辆与车辆（V2V）和车辆与一切（V2X）背景下的协作感知数据集进行全面评估。它突出了在自动驾驶车辆感知任务方面推动进展的大型基准测试的最新发展。论文系统地分析了各种数据集，根据多样性、传感器设置、质量、公共可用性和它们对下游任务的适用性等方面进行比较。还强调了领域转移、传感器设置限制和数据集多样性和可用性之间的关键挑战。关于在数据集开发过程中解决隐私和安全问题的重要性进行了强调，涉及数据共享和数据创建。结论强调了在自动驾驶车辆的发展过程中，需要全面、全球可访问的数据和来自技术和研究社区的协作努力，以克服这些挑战并充分利用自动驾驶技术的潜力。

URL

https://arxiv.org/abs/2404.14022

PDF

https://arxiv.org/pdf/2404.14022.pdf
Challenges in automatic and selective plant-clearing

2024-04-22 09:01:14

Fabrice Mayran de Chamisso, Loïc Cotten, Valentine Dhers, Thomas Lompech, Florian Seywert, Arnaud Susset

arXiv_CV

arXiv_CV Segmentation Pose Autonomous
Abstract

With the advent of multispectral imagery and AI, there have been numerous works on automatic plant segmentation for purposes such as counting, picking, health monitoring, localized pesticide delivery, etc. In this paper, we tackle the related problem of automatic and selective plant-clearing in a sustainable forestry context, where an autonomous machine has to detect and avoid specific plants while clearing any weeds which may compete with the species being cultivated. Such an autonomous system requires a high level of robustness to weather conditions, plant variability, terrain and weeds while remaining cheap and easy to maintain. We notably discuss the lack of robustness of spectral imagery, investigate the impact of the reference database's size and discuss issues specific to AI systems operating in uncontrolled environments.

Abstract (translated)

随着多光谱图像和人工智能（AI）的出现，已经出现了大量用于自动植物分割的应用，例如计数、采摘、健康监测、局部农药配送等。在本文中，我们研究了在可持续林业背景下，自动且选择性地清除植物的相关问题，其中一台自主机器需要检测和避开可能与被栽培的物种竞争的特定植物。这样的自主系统需要具备很高的适应性来应对天气条件、植物多样性、地形和杂草，同时保持低成本和易于维护。我们特别讨论了光谱图像的适应性不足，研究了参考数据库的大小，并讨论了运行在不受控环境中的AI系统所面临的问题。

URL

https://arxiv.org/abs/2404.13996

PDF

https://arxiv.org/pdf/2404.13996.pdf
Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding

2024-04-22 03:31:34

Eunho Lee, Minwoo Jung, Ayoung Kim

arXiv_RO

arXiv_RO Detection Object_Detection Pose Autonomous 3D Enhancement
Abstract

Robust 3D object detection is a core challenge for autonomous mobile systems in field robotics. To tackle this issue, many researchers have demonstrated improvements in 3D object detection performance in datasets. However, real-world urban scenarios with unstructured and dynamic situations can still lead to numerous false positives, posing a challenge for robust 3D object detection models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. 3D object detection models usually perform well in detecting nearby objects but may exhibit suboptimal performance for distant ones. While conventional perception algorithms typically employ a single threshold in post-processing, the proposed algorithm addresses this issue by employing adaptive thresholds based on the distance from the ego-vehicle, minimizing false negatives and reducing false positives in urban scenarios. The results show performance enhancements in 3D object detection models across a range of scenarios, not only in dynamic urban road conditions but also in scenarios involving adverse weather conditions.
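The idea of replacing a single post-processing threshold with a distance-dependent one can be sketched as follows. The abstract does not give the paper's actual schedule, so the linear relaxation from a stricter near-range threshold to a looser far-range one, the parameter values, and the detection dictionary layout are all assumptions made for illustration.

```python
import math


def adaptive_threshold(distance, near_thr=0.5, far_thr=0.2, max_range=80.0):
    """Score threshold that moves linearly from near_thr (0 m) to far_thr (max_range)."""
    ratio = min(max(distance / max_range, 0.0), 1.0)
    return near_thr + (far_thr - near_thr) * ratio


def filter_detections(detections, **threshold_kwargs):
    """Keep detections whose score reaches the threshold for their range.

    Each detection is a dict with keys "x", "y" (metres, ego-vehicle at the
    origin) and "score" (detector confidence in [0, 1]).
    """
    kept = []
    for det in detections:
        distance = math.hypot(det["x"], det["y"])
        if det["score"] >= adaptive_threshold(distance, **threshold_kwargs):
            kept.append(det)
    return kept


if __name__ == "__main__":
    detections = [
        {"id": "near_weak", "x": 5.0, "y": 0.0, "score": 0.40},
        {"id": "near_strong", "x": 5.0, "y": 1.0, "score": 0.90},
        {"id": "far_weak", "x": 60.0, "y": 0.0, "score": 0.30},
    ]
    print([d["id"] for d in filter_detections(detections)])
```

With a fixed threshold of 0.5 the distant object would be dropped (a false negative), while a fixed threshold of 0.3 would let the weak nearby detection through (a likely false positive). The range-dependent threshold handles both cases.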

Abstract (translated)

稳健的三维物体检测是场机器人领域自动驾驶移动系统的一个核心挑战。为了解决这个问题，许多研究人员在数据集上展示了三维物体检测性能的提高。然而，现实世界中的无结构和动态情况仍然可能导致大量的误检，挑战了稳健的三维物体检测模型的应用。本文提出了一种基于自车距离调整物体检测阈值的后处理算法。三维物体检测模型通常在检测附近物体时表现良好，但对于较远的物体可能表现不佳。而传统感知算法通常在后处理阶段使用单个阈值，该算法通过基于自车距离的自适应阈值来解决这一问题，从而在城市场景中减少误检和提高误检率。结果显示，在各种场景中，三维物体检测模型的性能都有所提高，不仅在动态城市道路条件下，而且在涉及恶劣天气条件的场景中也是如此。

URL

https://arxiv.org/abs/2404.13852

PDF

https://arxiv.org/pdf/2404.13852.pdf
On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments

2024-04-22 02:42:32

Gang Ma, Hui Wei

arXiv_CV

arXiv_CV Detection Classification Relation Inference Optimization Autonomous 3D Point_Cloud
Abstract

Over the years, scene understanding has attracted growing interest in computer vision, providing the semantic and physical scene information necessary for robots to complete particular tasks autonomously. In 3D scenes, rich spatial geometric and topological information is often ignored by RGB-based approaches to scene understanding. In this study, we develop a bottom-up approach for scene understanding that infers support relations between objects from a point cloud. Our approach utilizes the spatial topology information of the plane pairs in the scene and consists of three major steps: 1) detection of pairwise spatial configuration: dividing primitive pairs into local support connection and local inner connection; 2) primitive classification: a combinatorial optimization method applied to classify primitives; and 3) support relations inference and hierarchy graph construction: bottom-up support relations inference and scene hierarchy graph construction containing primitive level and object level. Through experiments, we demonstrate that the algorithm achieves excellent performance in primitive classification and support relations inference. Additionally, we show that the scene hierarchy graph contains rich geometric and topological information of objects, and it possesses great scalability for scene understanding.

Abstract (translated)

在过去的几年里，场景理解在计算机视觉领域吸引了越来越多的关注，为机器人完成某些特定任务提供了语义和物理场景信息。在3D场景中，基于RGB的 scene understanding 方法通常会忽略场景中的丰富空间几何和拓扑信息。在这项研究中，我们提出了一种自下而上的场景理解方法，推断出场景中物体之间的支持关系。我们的方法基于场景平面对的空间拓扑信息，包括三个主要步骤。1) 对对间空间配置的检测：将基本对分为局部支持连接和局部内连接；2) 基本分类：将基本分类为组合优化方法；和 3) 支持关系推断和层次图构建：自下而上支持关系推断和场景层次图构建包含基本水平和物体水平。通过实验，我们证明了该算法在基本分类和支持关系推理方面具有优异的性能。此外，我们还证明了场景层次图包含丰富的几何和拓扑信息，具有很好的可扩展性。

URL

https://arxiv.org/abs/2404.13842

PDF

https://arxiv.org/pdf/2404.13842.pdf
Neural Radiance Field in Autonomous Driving: A Survey

2024-04-22 01:36:50

Lei He, Leheng Li, Wenchao Sun, Zeyu Han, Yichen Liu, Sifa Zheng, Jianqiang Wang, Keqiang Li

arXiv_CV

arXiv_CV Deep_Learning Survey Attention SLAM Knowledge Autonomous 3D Reconstruction
Abstract

Neural Radiance Field (NeRF) has garnered significant attention from both academia and industry due to its intrinsic advantages, particularly its implicit representation and novel view synthesis capabilities. With the rapid advancements in deep learning, a multitude of methods have emerged to explore the potential applications of NeRF in the domain of Autonomous Driving (AD). However, a conspicuous void is apparent within the current literature. To bridge this gap, this paper conducts a comprehensive survey of NeRF's applications in the context of AD. Our survey is structured to categorize NeRF's applications in Autonomous Driving (AD), specifically encompassing perception, 3D reconstruction, simultaneous localization and mapping (SLAM), and simulation. We delve into in-depth analysis and summarize the findings for each application category, and conclude by providing insights and discussions on future directions in this field. We hope this paper serves as a comprehensive reference for researchers in this domain. To the best of our knowledge, this is the first survey specifically focused on the applications of NeRF in the Autonomous Driving domain.

Abstract (translated)

Neural Radiance Field（NeRF）因其固有优势，特别是其隐式表示和新视图合成能力，在学术界和产业界都引起了显著关注。随着深度学习的快速发展，为探索NeRF在自动驾驶（AD）领域的潜在应用，已经涌现出了许多方法。然而，当前文献中显然存在一个明显的空白。为了填补这一空白，本文对NeRF在AD领域中的应用进行全面调查。我们的调查旨在对NeRF的每个应用进行分类，包括感知、3D重建、同时定位与映射（SLAM）和仿真。我们深入分析每个应用类别，并总结了每个应用类别的发现。最后，我们提供了关于未来该领域的发展方向以及见解和讨论。我们希望，本文将成为该领域研究人员的全面参考。据我们所知，这是第一部专门关注NeRF在AD领域应用的调查。

URL

https://arxiv.org/abs/2404.13816

PDF

https://arxiv.org/pdf/2404.13816.pdf
Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

2024-04-21 21:45:23

Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

arXiv_AI

arXiv_AI Face Autonomous
Abstract

Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components carefully designed to overcome various system and physical challenges. Soar can leverage existing operational infrastructure such as street lampposts for a lower barrier to adoption. Soar adopts a new communication architecture that comprises a bi-directional multi-hop I2I network and a downlink I2V broadcast service, which are designed in an integrated manner based on off-the-shelf 802.11ac interfaces. Soar also features a hierarchical DL task management framework to achieve desirable load balancing among nodes and enable them to collaborate efficiently to run multiple data-intensive autonomous driving applications. We deployed a total of 18 Soar nodes on existing lampposts on campus, which have been operational for over two years. Our real-world evaluation shows that Soar can support a diverse set of autonomous driving applications and achieve desirable real-time performance and high communication reliability. Our findings and experiences in this work offer key insights into the development and deployment of next-generation smart roadside infrastructure and autonomous driving systems.

Abstract (translated)

近年来，智能路边设施（SRI）已经展示了实现完全自动驾驶系统的潜力。为了探索基于基础设施的自动驾驶系统的潜力，本文提出了Soar，第一个专门支持自动驾驶系统的端到端SRI系统的设计和部署。Soar由软件和硬件组件精心设计，以克服各种系统和物理挑战。Soar可以利用现有的道路路灯等运营基础设施，具有较低的采用门槛。Soar采用了一种新的通信架构，包括双向多跳的I2I网络和下行的I2V广播服务，这些服务基于集成802.11ac接口。Soar还具有分层DL任务管理框架，以实现节点之间可观的负载均衡，并使它们能够有效协作运行多个数据密集的自动驾驶应用程序。我们在校园内的18个现有路灯上部署了Soar节点，这些路灯已经运营了两年多。我们的实际评估结果表明，Soar可以支持各种自动驾驶应用程序，实现可观的真实世界性能和高的通信可靠性。本文的研究成果和经验为下一代智能路边设施和自动驾驶系统的开发和部署提供了关键见解。

URL

https://arxiv.org/abs/2404.13786

PDF

https://arxiv.org/pdf/2404.13786.pdf
Autonomous Robot for Disaster Mapping and Victim Localization

2024-04-21 20:32:02

Michael Potter, Rahil Bhowal, Richard Zhao, Anuj Patel, Jingming Cheng

arXiv_AI

arXiv_AI Autonomous
Abstract

In response to the critical need for effective reconnaissance in disaster scenarios, this research article presents the design and implementation of a complete autonomous robot system using the Turtlebot3 with Robot Operating System (ROS) Noetic. Upon deployment in closed, initially unknown environments, the system aims to generate a comprehensive map and identify any present 'victims' using AprilTags as stand-ins. We discuss our solution for search and rescue missions, while additionally exploring more advanced algorithms to improve search and rescue functionalities. We introduce a Cubature Kalman Filter to help reduce the mean squared error [m] for AprilTag localization and an information-theoretic exploration algorithm to expedite exploration in unknown environments. Just like turtles, our system takes it slow and steady, but when it's time to save the day, it moves at ninja-like speed! Despite Donatello's shell, he's no slowpoke - he zips through obstacles with the agility of a teenage mutant ninja turtle. So, hang on tight to your shells and get ready for a whirlwind of reconnaissance! Full pipeline code: this https URL. Exploration code: this https URL.

Abstract (translated)

为了应对灾难场景中有效的侦察需求，本文提出了一种使用Turtlebot3和Robotic Operating System (ROS) Noetic构建完整的自主机器人系统的设计和实现。在部署到封闭、最初未知的环境中后，系统旨在生成全面地图，并使用AprilTags作为替代品识别出任何潜在的“受害者”。我们讨论了我们的搜救任务解决方案，同时探索更高级别的算法以提高搜救功能。我们引入了立方体卡尔曼滤波器来帮助减少AprilTag定位的平均平方误差[m]，并介绍了信息论探索算法来加速未知环境中的探索。就像乌龟一样，我们的系统稳中求进，但当需要取得胜利的时候，它就像忍者一样快速移动！尽管Donatello的壳，他也不是个慢吞吞的，他像青少年突变忍者一样灵活地穿过障碍物。所以，紧握你的壳，准备迎接一场侦察狂潮吧！完整管道代码，https://URL；探索代码，https://URL

URL

https://arxiv.org/abs/2404.13767

PDF

https://arxiv.org/pdf/2404.13767.pdf
Seamless Underwater Navigation with Limited Doppler Velocity Log Measurements

2024-04-21 18:56:54

Nadav Cohen, Itzik Klein

arXiv_AI

arXiv_AI Pose Autonomous
Abstract

Autonomous Underwater Vehicles (AUVs) commonly utilize an inertial navigation system (INS) and a Doppler velocity log (DVL) for underwater navigation. To that end, their measurements are integrated through a nonlinear filter such as the extended Kalman filter (EKF). The DVL velocity vector estimate depends on retrieving reflections from the seabed, ensuring that at least three out of its four transmitted acoustic beams return successfully. When fewer than three beams are obtained, the DVL cannot provide a velocity update to bound the navigation solution drift. To cope with this challenge, in this paper, we propose a hybrid neural coupled (HNC) approach for seamless AUV navigation in situations of limited DVL measurements. First, we derive an approach to regress two or three missing DVL beams. Then, those beams, together with the measured beams, are incorporated into the EKF. We examined INS/DVL fusion in both loosely and tightly coupled approaches. Our method was trained and evaluated on recorded data from AUV experiments conducted in the Mediterranean Sea on two different occasions. The results illustrate that our proposed method outperforms the baseline loosely and tightly coupled model-based approaches by an average of 96.15%. It also demonstrates superior performance compared to a model-based beam estimator by an average of 12.41% in terms of velocity accuracy for scenarios involving two or three missing beams. Therefore, we demonstrate that our approach offers seamless AUV navigation in situations of limited beam measurements.
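Why at least three beams are needed can be seen from the usual linear beam model, in which each beam measures the projection of the vehicle velocity onto that beam's direction. The sketch below assumes a four-beam configuration with beams at yaw angles of 45, 135, 225, and 315 degrees and a common pitch angle of 20 degrees. These values, the sign convention, and the function names are illustrative assumptions and are not taken from the paper. NumPy is required.

```python
import numpy as np


def beam_matrix(pitch_deg=20.0):
    """4x3 matrix whose rows are the unit direction vectors of the four beams."""
    alpha = np.radians(pitch_deg)
    yaws = np.radians([45.0, 135.0, 225.0, 315.0])
    return np.column_stack([
        np.cos(yaws) * np.sin(alpha),
        np.sin(yaws) * np.sin(alpha),
        np.full(4, np.cos(alpha)),
    ])


def velocity_from_beams(beam_velocities, available, pitch_deg=20.0):
    """Least-squares 3-D velocity from the beams flagged True in `available`.

    Returns None when the available beams do not determine the velocity
    uniquely (fewer than three independent beam directions).
    """
    mask = np.asarray(available, dtype=bool)
    A = beam_matrix(pitch_deg)[mask]
    if np.linalg.matrix_rank(A) < 3:
        return None
    y = np.asarray(beam_velocities, dtype=float)[mask]
    v, *_ = np.linalg.lstsq(A, y, rcond=None)
    return v


if __name__ == "__main__":
    v_true = np.array([1.0, -0.2, 0.05])
    y = beam_matrix() @ v_true          # noise-free beam measurements
    print(velocity_from_beams(y, [True, True, True, False]))
    print(velocity_from_beams(y, [True, True, False, False]))
```

With two beams the system has two equations for three unknowns, so no velocity update is available. Filling in the missing entries of `y` with regressed values, as the abstract proposes, restores a solvable system.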

Abstract (translated)

自主水下航行器（AUV）通常使用惯性导航系统（INS）和多普勒速度日志（DVL）进行水下导航。为此，它们的测量通过非线性滤波器如扩展卡尔曼滤波器（EKF）进行整合。DVL速度向量估计取决于从海底回收反射波，确保其四个传输声波中至少有三个成功返回。当获得的反射波少于三个时，DVL无法提供速度更新来绑定导航解决方案的漂移。为了应对这个挑战，在本文中，我们提出了一个在有限DVL测量情况下实现无缝AUV导航的混合神经网络耦合（HNC）方法。首先，我们驱动一种回归两个或三个缺失DVL波的方法。然后，将测量到的波与DVL波一起并入扩展卡尔曼滤波器（EKF）。我们研究了INS/DVL融合在松散和紧密耦合方法下的情况。对AUV实验在 Mediterranean Sea 两次不同场合记录的数据进行了训练和评估。结果表明，与基线松散和紧密耦合模型方法相比，我们提出的方法平均提高了96.15%。此外，与基于模型的波估计器相比，在涉及两个或三个缺失波的场景中，我们的方法具有平均12.41%的更高速度准确性。因此，我们证明了在有限波测量的情况下，我们的方法可以实现无缝AUV导航。

URL

https://arxiv.org/abs/2404.13742

PDF

https://arxiv.org/pdf/2404.13742.pdf
A Practical Multilevel Governance Framework for Autonomous and Intelligent Systems

2024-04-21 17:15:43

Lukas D. Pöhler, Klaus Diepold, Wendell Wallach

arXiv_AI

arXiv_AI GAN Pose Autonomous
Abstract

Autonomous and intelligent systems (AIS) facilitate a wide range of beneficial applications across a variety of different domains. However, technical characteristics such as unpredictability and lack of transparency, as well as potential unintended consequences, pose considerable challenges to the current governance infrastructure. Furthermore, the speed of development and deployment of applications outpaces the ability of existing governance institutions to put in place effective ethical-legal oversight. New approaches for agile, distributed and multilevel governance are needed. This work presents a practical framework for multilevel governance of AIS. The framework enables mapping actors onto six levels of decision-making including the international, national and organizational levels. Furthermore, it offers the ability to identify and evolve existing tools or create new tools for guiding the behavior of actors within the levels. Governance mechanisms enable actors to shape and enforce regulations and other tools, which when complemented with good practices contribute to effective and comprehensive governance.

Abstract (translated)

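Autonomous and intelligent systems (AIS) enable a wide range of beneficial applications across many domains. However, technical characteristics such as unpredictability and lack of transparency, together with potential unintended consequences, pose major challenges to the current governance infrastructure. In addition, applications are developed and deployed faster than existing governance institutions can put effective ethical and legal oversight in place. New agile, distributed and multilevel approaches to governance are needed. This work presents a practical framework for multilevel governance of AIS. The framework allows actors to be mapped onto six levels of decision-making, including the international, national and organizational levels. It also offers the ability to identify and evolve existing tools, or to create new ones, for guiding the behavior of actors within each level. Governance mechanisms enable actors to shape and enforce regulations and other tools, which, combined with good practices, contribute to effective and comprehensive governance.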

URL

https://arxiv.org/abs/2404.13719

PDF

https://arxiv.org/pdf/2404.13719.pdf
A Complete System for Automated 3D Semantic-Geometric Mapping of Corrosion in Industrial Environments

2024-04-21 15:40:32

Rui Pimentel de Figueiredo, Stefan Nordborg Eriksen, Ignacio Rodriguez, Simon Bøgh

arXiv_CV

arXiv_CV Segmentation Semantic_Segmentation Detection Deep_Learning Quantitative Pose Autonomous 3D
Abstract

Corrosion, a naturally occurring process leading to the deterioration of metallic materials, demands diligent detection for quality control and the preservation of metal-based objects, especially within industrial contexts. Traditional techniques for corrosion identification, including ultrasonic testing, radiographic testing, and magnetic flux leakage, require expensive and bulky equipment to be deployed on-site for effective data acquisition. An unexplored alternative is to identify corrosion with lightweight, conventional camera systems and state-of-the-art computer vision methods. In this work, we propose a complete system for semi-automated corrosion identification and mapping in industrial environments. We combine recent advances in LiDAR-based localization and mapping with vision-based deep learning techniques for semantic segmentation in order to build semantic-geometric maps of industrial environments. Unlike previous corrosion identification systems available in the literature, our multi-modal system is low-cost, portable, and semi-autonomous, and allows untrained personnel to collect large datasets. A set of experiments in an indoor laboratory environment demonstrates quantitatively the high accuracy of the employed LiDAR-based 3D mapping and localization system, with average absolute and relative pose errors of less than 0.05 m and 0.02 m. Also, our data-driven semantic segmentation model achieves around 70% precision when trained with our pixel-wise manually annotated dataset.
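
For readers unfamiliar with the two kinds of figures quoted above, the following sketch shows one common way such numbers are computed. The abstract does not define its metrics, so both definitions are assumptions: pixel-wise precision as true positives over all predicted positives, and an absolute position error as the mean Euclidean distance between estimated and reference positions. The function names are hypothetical.

```python
import numpy as np

def pixel_precision(pred_mask, true_mask):
    """Share of predicted corrosion pixels that are labeled corrosion."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.count_nonzero(pred & true)    # predicted and labeled corrosion
    fp = np.count_nonzero(pred & ~true)   # predicted corrosion, labeled background
    return tp / (tp + fp) if (tp + fp) else float("nan")

def mean_position_error(est_xyz, ref_xyz):
    """Mean Euclidean distance between matched position pairs, in input units."""
    diff = np.asarray(est_xyz, dtype=float) - np.asarray(ref_xyz, dtype=float)
    return float(np.linalg.norm(diff, axis=1).mean())
```

Under these assumed definitions, a precision of around 70% would mean that roughly seven in ten pixels flagged as corrosion match the manual annotation. It says nothing about how much annotated corrosion is missed.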

Abstract (translated)

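Corrosion is a naturally occurring process that degrades metallic materials, and it calls for rigorous detection for quality control and the preservation of metal objects in industrial environments. Traditional corrosion identification techniques, including ultrasonic testing, radiographic testing and magnetic flux leakage, require expensive and bulky equipment to be deployed on site for effective data acquisition. An unexplored alternative is to use lightweight, conventional camera systems and state-of-the-art computer vision techniques to identify and map corrosion. In this work, we propose a system for semi-automated corrosion identification and mapping in industrial environments. We use recent LiDAR-based localization and mapping methods together with vision-based deep learning techniques for semantic segmentation to build semantic-geometric maps of industrial environments. Unlike previous corrosion identification systems in the literature, the multi-modal system we designed is low-cost, portable and semi-autonomous, and allows untrained personnel to collect large amounts of data. A series of experiments in an indoor laboratory environment demonstrates the high accuracy of the LiDAR-based 3D mapping and localization system, with average absolute and relative pose errors below 0.05 m and 0.02 m. In addition, our data-driven semantic segmentation model achieves around 70% precision when trained on our manually annotated pixel-wise dataset.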

URL

https://arxiv.org/abs/2404.13691

PDF

https://arxiv.org/pdf/2404.13691.pdf
Adaptive Social Force Window Planner with Reinforcement Learning

2024-04-21 14:41:40

Mauro Martini, Noé Pérez-Higueras, Andrea Ostuni, Marcello Chiaberge, Fernando Caballero, Luis Merino

arXiv_RO

arXiv_RO Reinforcement_Learning Pose Autonomous Agent
Abstract

Human-aware navigation is a complex task for mobile robots, requiring an autonomous navigation system capable of achieving efficient path planning together with socially compliant behaviors. Social planners usually add costs or constraints to the objective function, which leads to intricate tuning processes or to solutions tailored to a specific social scenario. Machine Learning can enhance planners' versatility and help them learn complex social behaviors from data. This work proposes an adaptive social planner that uses a Deep Reinforcement Learning agent to dynamically adjust the weighting parameters of the cost function used to evaluate trajectories. The resulting planner combines the robustness of the classic Dynamic Window Approach, integrated with a social cost based on the Social Force Model, with the flexibility of learning methods to boost the overall performance on social navigation tasks. Our extensive experimentation in different environments demonstrates the general advantage of the proposed method over static-cost planners.
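
The idea of re-weighting a trajectory cost at run time can be sketched as follows. This is a minimal illustration, not the authors' planner: the two cost terms, the exponential repulsion used as a stand-in for a Social Force Model cost, the parameters `strength` and `falloff`, and all function names are assumptions. In the abstract's setup the `weights` dictionary would be produced by the reinforcement learning agent; here it is passed in by hand.

```python
import math

def social_cost(robot_xy, people_xy, strength=1.0, falloff=0.5):
    """Exponential repulsion summed over nearby people (assumed form)."""
    return sum(strength * math.exp(-math.dist(robot_xy, p) / falloff)
               for p in people_xy)

def trajectory_cost(end_xy, goal_xy, people_xy, weights):
    """Weighted sum of a goal-distance term and a social term."""
    terms = {
        "goal": math.dist(end_xy, goal_xy),
        "social": social_cost(end_xy, people_xy),
    }
    return sum(weights[name] * value for name, value in terms.items())

def pick_trajectory(candidates, goal_xy, people_xy, weights):
    """Return the candidate end point with the lowest weighted cost."""
    return min(candidates,
               key=lambda end_xy: trajectory_cost(end_xy, goal_xy, people_xy, weights))
```

With the social weight at zero, the planner picks the end point closest to the goal even if it passes next to a person. Raising the social weight makes it prefer a detour. A static-cost planner fixes these weights once, whereas the adaptive planner described above changes them as the scene changes.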

Abstract (translated)

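Human-aware navigation is a complex task for mobile robots. It requires an autonomous navigation system that achieves efficient path planning together with socially compliant behavior. Social planners usually add costs or constraints to the objective function, which leads to complicated tuning or to solutions tailored to a specific social scenario. Machine learning can make planners more flexible and help them learn complex social behaviors from data. This paper proposes an adaptive social planner that uses a deep reinforcement learning agent to dynamically adjust the weighting parameters of the cost function used to evaluate trajectories. The planner combines the robustness of the classic Dynamic Window Approach with a social cost based on the Social Force Model, and draws on the flexibility of learning methods to improve overall performance on social navigation tasks. Experiments in different environments show that the proposed method has an advantage over static-cost planners.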

URL

https://arxiv.org/abs/2404.13678

PDF

https://arxiv.org/pdf/2404.13678.pdf
