Abstract
This report describes the winning solution to the semantic segmentation task of the Robust Vision Challenge at ECCV 2022. Our method adopts the FAN-B-Hybrid model as the encoder and uses Segformer as the segmentation framework. The model is trained on a composite dataset containing images from nine datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, Wilddash2, IDD, BDD, and COCO) with a simple dataset-balancing strategy. All the original labels are projected into a unified 256-class label space, and the model is trained with a naive cross-entropy loss. Without significant hyperparameter tuning or any specific loss weighting, our solution ranks 1st on all the required semantic segmentation benchmarks from multiple domains (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, and Wilddash2). Our method can serve as a strong baseline for the multi-domain segmentation task, and our codebase should be helpful to future work. Code will be available at this https URL.
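The label-space unification described above can be sketched as follows. This is a minimal, hypothetical illustration (the class ids, lookup table, and ignore index are assumptions, not the authors' actual mapping): each dataset's original labels are remapped into a shared 256-class space via a per-dataset lookup table, and a plain cross-entropy loss is applied in that space.

```python
import torch
import torch.nn as nn

UNIFIED_NUM_CLASSES = 256
IGNORE_INDEX = 255  # assumed ignore id within the unified space

# Hypothetical lookup table for a toy 3-class dataset: entry i gives the
# unified-space id of the dataset's original class i (ids are illustrative).
dataset_to_unified = torch.tensor([10, 42, 7])

def project_labels(mask: torch.Tensor, lut: torch.Tensor) -> torch.Tensor:
    """Remap a dataset-specific label mask into the unified label space."""
    return lut[mask]

# Naive cross-entropy over the unified space, as stated in the abstract.
criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)

logits = torch.randn(2, UNIFIED_NUM_CLASSES, 8, 8)  # dummy model output
raw_mask = torch.randint(0, 3, (2, 8, 8))           # dataset-specific labels
unified_mask = project_labels(raw_mask, dataset_to_unified)
loss = criterion(logits, unified_mask)
```

Because every dataset is projected into the same target space, a single segmentation head can be trained on all nine datasets jointly without per-dataset losses or weighting.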
URL
https://arxiv.org/abs/2210.12852