Paper Reading AI Learner

SuperEdge: Towards a Generalization Model for Self-Supervised Edge Detection

2024-01-04 15:21:53
Leng Kai, Zhang Zhijie, Liu Jie, Zed Boukhers, Sui Wei, Cong Yang, Li Zhijun

Abstract

Edge detection is a fundamental technique in various computer vision tasks. Edges are indeed effectively delineated by pixel discontinuity and can offer reliable structural information even in textureless areas. State-of-the-art heavily relies on pixel-wise annotations, which are labor-intensive and subject to inconsistencies when acquired manually. In this work, we propose a novel self-supervised approach for edge detection that employs a multi-level, multi-homography technique to transfer annotations from synthetic to real-world datasets. To fully leverage the generated edge annotations, we developed SuperEdge, a streamlined yet efficient model capable of concurrently extracting edges at pixel-level and object-level granularity. Thanks to self-supervised training, our method eliminates the dependency on manual annotated edge labels, thereby enhancing its generalizability across diverse datasets. Comparative evaluations reveal that SuperEdge advances edge detection, demonstrating improvements of 4.9% in ODS and 3.3% in OIS over the existing STEdge method on BIPEDv2.

Abstract (translated)

边缘检测是各种计算机视觉任务中的基本技术。事实上,通过像素不连续性有效地描绘边缘可以提供与纹理无关区域中的可靠结构信息。先进的边缘检测方法主要依赖于像素级的注释,这些注释工作量大且容易出现不一致。在本文中,我们提出了一种新颖的自监督边缘检测方法,该方法采用多级多同构技术将注释从合成数据集转移至现实世界数据集。为了充分利用生成的边缘注释,我们开发了SuperEdge,一种在像素级和物体级别上同时提取边缘的流线型高效模型。由于自监督训练,我们的方法消除了对手动注释边缘标签的依赖,从而提高了其对不同数据集的泛化能力。比较评估显示,SuperEdge在边缘检测方面取得了改进,其ODS提高了4.9%,OIS提高了3.3%,超过了现有STEdge方法。

URL

https://arxiv.org/abs/2401.02313

PDF

https://arxiv.org/pdf/2401.02313.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot