Abstract
We introduce a self-supervised pretraining method, called OccFeat, for camera-only Bird's-Eye-View (BEV) segmentation networks. With OccFeat, we pretrain a BEV network via occupancy prediction and feature distillation tasks. Occupancy prediction gives the model a 3D geometric understanding of the scene; however, the learned geometry is class-agnostic. Hence, we add semantic information to the model in 3D space through distillation from a self-supervised pretrained image foundation model. Models pretrained with our method exhibit improved BEV semantic segmentation performance, particularly in low-data scenarios. Moreover, empirical results affirm the efficacy of integrating feature distillation with 3D occupancy prediction in our pretraining approach.
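The abstract describes two pretraining tasks combined into one objective: class-agnostic occupancy prediction and feature distillation from a frozen image foundation model. A minimal sketch of such a combined loss is below; the function name, tensor shapes, loss forms, and weighting are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def occfeat_pretrain_loss(occ_logits, occ_target, student_feats, teacher_feats,
                          distill_weight=1.0):
    """Sketch of a combined pretraining objective: binary occupancy
    prediction plus feature distillation toward a frozen, self-supervised
    image foundation model. All names/shapes here are assumptions.

    occ_logits:    (B, X, Y, Z) predicted occupancy logits
    occ_target:    (B, X, Y, Z) binary occupancy supervision
    student_feats: (B, N, C) 3D features produced by the BEV network
    teacher_feats: (B, N, C) target features from the frozen teacher
    """
    # Geometry task: class-agnostic binary occupancy prediction.
    occ_loss = F.binary_cross_entropy_with_logits(occ_logits, occ_target)

    # Semantics task: pull student features toward the teacher's
    # (here via a cosine-similarity loss).
    distill_loss = 1.0 - F.cosine_similarity(
        student_feats, teacher_feats, dim=-1).mean()

    return occ_loss + distill_weight * distill_loss
```

The key design point the abstract argues for is that neither term alone suffices: occupancy supplies geometry without semantics, and the distillation term injects semantics into the same 3D space.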
URL
https://arxiv.org/abs/2404.14027