Paper Reading AI Learner

A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation

2024-04-25 00:30:03
Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng

Abstract

As one of the emerging challenges in Automated Machine Learning, the Hardware-aware Neural Architecture Search (HW-NAS) tasks can be treated as black-box multi-objective optimization problems (MOPs). An important application of HW-NAS is real-time semantic segmentation, which plays a pivotal role in autonomous driving scenarios. The HW-NAS for real-time semantic segmentation inherently needs to balance multiple optimization objectives, including model accuracy, inference speed, and hardware-specific considerations. Despite its importance, benchmarks have yet to be developed to frame such a challenging task as multi-objective optimization. To bridge the gap, we introduce a tailored streamline to transform the task of HW-NAS for real-time semantic segmentation into standard MOPs. Building upon the streamline, we present a benchmark test suite, CitySeg/MOP, comprising fifteen MOPs derived from the Cityscapes dataset. The CitySeg/MOP test suite is integrated into the EvoXBench platform to provide seamless interfaces with various programming languages (e.g., Python and MATLAB) for instant fitness evaluations. We comprehensively assessed the CitySeg/MOP test suite on various multi-objective evolutionary algorithms, showcasing its versatility and practicality. Source codes are available at this https URL.

Abstract (translated)

作为自动机器学习领域新兴挑战之一,硬件感知的神经架构搜索(HW-NAS)任务可以被视为多目标优化问题(MOPs)。HW-NAS在实时语义分割(Real-time Semantic Segmentation,RSS)中的应用至关重要。为实现实时语义分割,HW-NAS在实时语义分割本身就需要在多个优化目标之间取得平衡,包括模型准确性、推理速度和硬件特定考虑。尽管HW-NAS在实时语义分割中具有重要性,但迄今为止还没有为这样的具有挑战性的任务开发基准。为了填补这一空白,我们引入了一个定制的流线来将HW-NAS在实时语义分割中的任务转化为标准的MOP。在此基础上,我们提出了一个基准测试套件,CitySeg/MOP,包含来自Cityscapes数据集的15个MOP。CitySeg/MOP测试套件已集成到EvoXBench平台中,为各种编程语言(例如Python和MATLAB)提供了一个无缝的界面来进行即时的健身评估。我们对CitySeg/MOP测试套件进行了全面评估,展示了其多才性和实用性。源代码可以从此链接获取:https://url.cn/xyz6uJ4

URL

https://arxiv.org/abs/2404.16266

PDF

https://arxiv.org/pdf/2404.16266.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot