Paper Reading AI Learner

Zero-Shot Edge Detection with SCESAME: Spectral Clustering-based Ensemble for Segment Anything Model Estimation

2023-08-26 06:19:59
Hiroaki Yamagiwa, Yusuke Takase, Hiroyuki Kambe, Ryosuke Nakamoto

Abstract

This paper proposes a novel zero-shot edge detection with SCESAME, which stands for Spectral Clustering-based Ensemble for Segment Anything Model Estimation, based on the recently proposed Segment Anything Model (SAM). SAM is a foundation model for segmentation tasks, and one of the interesting applications of SAM is Automatic Mask Generation (AMG), which generates zero-shot segmentation masks of an entire image. AMG can be applied to edge detection, but suffers from the problem of overdetecting edges. Edge detection with SCESAME overcomes this problem by three steps: (1) eliminating small generated masks, (2) combining masks by spectral clustering, taking into account mask positions and overlaps, and (3) removing artifacts after edge detection. We performed edge detection experiments on two datasets, BSDS500 and NYUDv2. Although our zero-shot approach is simple, the experimental results on BSDS500 showed almost identical performance to human performance and CNN-based methods from seven years ago. In the NYUDv2 experiments, it performed almost as well as recent CNN-based methods. These results indicate that our method has the potential to be a strong baseline for future zero-shot edge detection methods. Furthermore, SCESAME is not only applicable to edge detection, but also to other downstream zero-shot tasks.

Abstract (translated)

这篇文章提出了一种新的零样本边缘检测方法,名为SCESAME,其含义是基于最近提出的Segment Anything Model(SAM)的Spectral Clustering-based Ensemble,用于分割任意模型估计。SAM是一个用于分割任务的基元模型,其中SAM的一个有趣的应用是自动分割掩码生成(AMG),该方法生成整个图像的零样本分割掩码。AMG可以应用于边缘检测,但存在过度检测边缘的问题。通过三个步骤克服了这个问题:(1)消除生成的小掩码;(2)通过Spectral Clustering结合掩码,考虑掩码的位置和重叠;(3)在边缘检测后删除痕迹。我们在两个数据集上进行了边缘检测实验,分别是BSDS500和NYUDv2。尽管我们的零样本方法很简单,在BSDS500上的实验结果与七年前人类表现和CNN方法几乎相同。在NYUDv2上,它几乎与最近的CNN方法表现相同。这些结果表明,我们的方法有成为未来零样本边缘检测方法的强大基准的潜力。此外,SCESAME不仅可以应用于边缘检测,还可以应用于其他零样本后续任务。

URL

https://arxiv.org/abs/2308.13779

PDF

https://arxiv.org/pdf/2308.13779.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot