Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction

2019-04-06 02:31:04
Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H.S. Torr

Abstract

In this paper, we aim to solve pixel-wise binary problems, including salient object segmentation, skeleton extraction, and edge detection, by introducing a unified architecture. Previous works have proposed tailored methods that solve each of the three tasks independently. Here, we show that these tasks share similarities that can be exploited to develop a unified framework. In particular, we introduce a horizontal cascade, each component of which is densely connected to the outputs of the previous component. Stringing these components together lets us exploit features across different levels hierarchically and address multiple pixel-wise binary regression tasks effectively. To assess the performance of the proposed network on these tasks, we carry out exhaustive evaluations on multiple representative datasets. Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods. All the code in this paper will be publicly available.
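
The dense connectivity mentioned in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the function names `fuse` and `dense_cascade`, the distance-based weights, and the use of plain Python lists in place of feature maps are all assumptions made for illustration. The only property it tries to show is that every component in one stage receives the outputs of all components in the previous stage.

```python
from typing import List

Feature = List[float]  # flattened stand-in for a real feature map (assumption)


def fuse(inputs: List[Feature], weights: List[float]) -> Feature:
    """Weighted element-wise mean of the inputs.

    A real network would use learned layers here; fixed weights are a placeholder.
    """
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, vals)) / total
            for vals in zip(*inputs)]


def dense_cascade(backbone_feats: List[Feature], num_stages: int) -> List[List[Feature]]:
    """Return the outputs of every stage, starting with the backbone features.

    Each component i of a stage is fed by ALL outputs of the previous stage.
    The hypothetical weights 1 / (1 + |i - j|) favour the component's own level.
    """
    stages = [backbone_feats]
    for _ in range(num_stages):
        prev = stages[-1]
        current = []
        for i in range(len(prev)):
            weights = [1.0 / (1 + abs(i - j)) for j in range(len(prev))]
            current.append(fuse(prev, weights))
        stages.append(current)
    return stages


if __name__ == "__main__":
    # Three backbone levels, each a two-element toy feature.
    feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    for depth, stage in enumerate(dense_cascade(feats, num_stages=2)):
        print(depth, stage)
```

In this sketch the number of components per stage equals the number of backbone levels, which is a simplification; the abstract does not specify how many components each stage has or how their inputs are combined.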

URL

https://arxiv.org/abs/1803.09860

PDF

https://arxiv.org/pdf/1803.09860.pdf

