Abstract
As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images and point clouds, can enhance detection accuracy. However, no existing model can simultaneously detect an object's position in both point clouds and images and establish the correspondence between the two detections. This information is invaluable for human-machine interaction, offering new possibilities for its enhancement. In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation. Furthermore, to assess the accuracy of the object correspondence between point clouds and images, this paper proposes a new evaluation metric, Consistency Precision (CP). To verify the effectiveness of the proposed framework, extensive experiments were conducted on the KITTI and DAIR-V2X datasets. The study also examined how the proposed consistency detection method performs on images when the calibration parameters between images and point clouds are perturbed, compared with existing post-processing methods. The experimental results demonstrate that the proposed method achieves excellent detection performance and robustness, realizing end-to-end consistency detection. The source code will be made publicly available at this https URL.
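The abstract does not define Consistency Precision, but the general idea of scoring 2D/3D correspondence can be illustrated. The sketch below is a hypothetical, simplified metric of this kind, not the paper's definition of CP: given predicted (image box, point-cloud box) pairs, it counts a pair as correct when the 3D box, after projection into the image via the calibration parameters (projection assumed done upstream), overlaps its paired 2D box above an IoU threshold. All function and parameter names here are illustrative assumptions.

```python
def iou_2d(a, b):
    """Axis-aligned IoU between two image boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def consistency_precision(pairs, img_boxes, projected_boxes, iou_thr=0.5):
    """Illustrative correspondence score (NOT the paper's CP definition):
    fraction of predicted (image_idx, point_cloud_idx) pairs whose
    projected 3D box agrees with the paired 2D box (IoU >= iou_thr).

    pairs           : list of (i, j) index pairs predicted by the model
    img_boxes       : 2D detections in the image, (x1, y1, x2, y2)
    projected_boxes : 3D detections projected into the image plane
    """
    if not pairs:
        return 0.0
    correct = sum(
        iou_2d(img_boxes[i], projected_boxes[j]) >= iou_thr
        for i, j in pairs
    )
    return correct / len(pairs)
```

Perturbing the calibration parameters, as in the robustness experiments described above, would shift `projected_boxes` and lower this kind of score for methods that rely on projection-based post-processing to pair detections.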
URL
https://arxiv.org/abs/2405.01258