Abstract
Visual place recognition (VPR) enables autonomous robots to identify previously visited locations, which contributes to tasks such as simultaneous localization and mapping (SLAM). VPR faces challenges such as accurate image neighbor retrieval and appearance changes in scenery. Event cameras, also known as dynamic vision sensors, are a new sensor modality for VPR and offer a promising solution to these challenges with their unique attributes: high temporal resolution (1 MHz clock), ultra-low latency (on the order of microseconds), and high dynamic range (>120 dB). These attributes make event cameras less susceptible to motion blur and more robust under variable lighting conditions, making them well suited to addressing VPR challenges. However, the scarcity of event-based VPR datasets, partly due to the novelty and cost of event cameras, hampers their adoption. To fill this data gap, our paper introduces the NYC-Event-VPR dataset to the robotics and computer vision communities, featuring the Prophesee IMX636 HD event sensor (1280x720 resolution) combined with an RGB camera and a GPS module. It encompasses over 13 hours of geotagged event data spanning 260 kilometers across New York City, covering diverse lighting and weather conditions, day/night scenarios, and multiple visits to various locations. Furthermore, our paper employs three frameworks to conduct generalization performance assessments, promoting innovation in event-based VPR and its integration into robotics applications.
URL
https://arxiv.org/abs/2410.21615
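Note
As a rough illustration of the kind of data the abstract describes, the sketch below accumulates raw event tuples (x, y, timestamp, polarity) into a 2D frame, a common preprocessing step before applying frame-based VPR retrieval. This is not the paper's pipeline: the field names, the 50 ms window, and the synthetic events are assumptions for illustration; only the 1280x720 IMX636 resolution and microsecond timestamps come from the abstract.

    # Minimal sketch (assumed preprocessing, not the paper's method):
    # accumulate events into a signed event-count frame.
    import numpy as np

    WIDTH, HEIGHT = 1280, 720          # IMX636 HD sensor resolution (from the abstract)
    WINDOW_US = 50_000                 # assumed 50 ms accumulation window

    def events_to_frame(events, t_start):
        """Accumulate events in [t_start, t_start + WINDOW_US) into a frame
        holding positive-minus-negative polarity counts per pixel."""
        frame = np.zeros((HEIGHT, WIDTH), dtype=np.int32)
        mask = (events["t"] >= t_start) & (events["t"] < t_start + WINDOW_US)
        for x, y, p in zip(events["x"][mask], events["y"][mask], events["p"][mask]):
            frame[y, x] += 1 if p > 0 else -1
        return frame

    # Synthetic events stand in for the dataset's geotagged event streams.
    rng = np.random.default_rng(0)
    n = 10_000
    events = {
        "x": rng.integers(0, WIDTH, n),
        "y": rng.integers(0, HEIGHT, n),
        "t": rng.integers(0, 100_000, n),   # microsecond timestamps
        "p": rng.integers(0, 2, n),         # polarity: 0 = OFF, 1 = ON
    }
    frame = events_to_frame(events, t_start=0)
    print(frame.shape, frame.min(), frame.max())

Frames produced this way can then be fed to standard image-retrieval backbones; the actual NYC-Event-VPR evaluation frameworks may represent events differently.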