Instance-Level Relative Saliency Ranking with Graph Reasoning

2021-07-08 13:10:42
Nian Liu, Long Li, Wangbo Zhao, Junwei Han, Ling Shao

Abstract

Conventional salient object detection models cannot differentiate the importance of different salient objects. Recently, two works have been proposed to detect saliency ranking by assigning different degrees of saliency to different objects. However, one of these models cannot differentiate object instances, and the other focuses more on inferring the sequential order of attention shifts. In this paper, we investigate a practical problem setting that requires simultaneously segmenting salient instances and inferring their relative saliency rank order. We present a novel unified model as the first end-to-end solution, where an improved Mask R-CNN is first used to segment salient instances and a saliency ranking branch is then added to infer the relative saliency. For relative saliency ranking, we build a new graph reasoning module by combining four graphs to incorporate the instance interaction relation, local contrast, global contrast, and a high-level semantic prior, respectively. A novel loss function is also proposed to effectively train the saliency ranking branch. In addition, a new dataset and an evaluation metric are proposed for this task, with the aim of pushing this field of research forward. Finally, experimental results demonstrate that our proposed model is more effective than previous methods. We also show an example of its practical use in adaptive image retargeting.
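
To make the task output concrete, the following is a minimal sketch of turning per-instance saliency scores into a relative rank order. It illustrates the output format only and is not the paper's model: the function name `rank_instances` and the example scores are hypothetical, and the abstract does not specify how the ranking branch produces or post-processes its scores.

```python
from typing import List, Sequence


def rank_instances(scores: Sequence[float]) -> List[int]:
    """Return one rank per instance, where rank 1 is the most salient.

    `scores` is assumed to hold one scalar saliency score per segmented
    instance (a hypothetical interface, not taken from the paper).
    Ties keep their original order because Python's sort is stable.
    """
    # Indices of instances sorted from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


# Hypothetical scores for three segmented instances.
example_scores = [0.2, 0.9, 0.5]
print(rank_instances(example_scores))  # [3, 1, 2]
```

Here the second instance has the highest score and therefore receives rank 1, while the first instance, with the lowest score, receives rank 3.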

URL

https://arxiv.org/abs/2107.03824

PDF

https://arxiv.org/pdf/2107.03824.pdf

