Paper Reading AI Learner

SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection

2024-05-05 12:11:33
Kassaw Abraham Mulat, Zhengyong Feng, Tegegne Solomon Eshetie, Ahmed Endris Hasen

Abstract

Salient object detection (SOD) remains an important task in computer vision, with applications ranging from image segmentation to autonomous driving. Fully convolutional network (FCN)-based methods have made remarkable progress in visual saliency detection over the past decade. However, these methods have limitations in accurately detecting salient objects, particularly in challenging scenes with multiple objects, small objects, or low-resolution objects. To address these limitations, we propose a Saliency Fusion Attention U-Net (SalFAU-Net) model, which incorporates a saliency fusion module into each decoder block of the Attention U-Net to generate a saliency probability map from each block. SalFAU-Net employs an attention mechanism to selectively focus on the most informative regions of an image and suppress non-salient regions. We train SalFAU-Net on the DUTS dataset using a binary cross-entropy loss function and evaluate it on six popular SOD benchmark datasets. The experimental results demonstrate that SalFAU-Net achieves competitive performance compared to other methods in terms of mean absolute error (MAE), F-measure, S-measure, and E-measure.
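The abstract names two components without detailing them: an attention gate that suppresses non-salient regions, and a saliency fusion module that turns each decoder block's features into a saliency probability map. The sketch below is a minimal, hypothetical PyTorch illustration of those two ideas under stated assumptions: the attention gate follows the standard Attention U-Net formulation, the per-stage maps are fused by simple averaging, and the binary cross-entropy loss is applied with deep supervision on every side output. Channel sizes, module names, and the fusion rule are illustrative guesses, not the paper's verified design.

```python
# Hypothetical sketch of an attention gate plus per-decoder-stage saliency
# heads with fused BCE loss. Names and design details are assumptions based
# only on the abstract, not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Weights encoder features by a decoder (gating) signal, suppressing
    non-salient spatial locations (standard Attention U-Net style)."""
    def __init__(self, enc_ch, dec_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv2d(enc_ch, inter_ch, kernel_size=1)
        self.w_dec = nn.Conv2d(dec_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, enc_feat, dec_feat):
        # Assumes both feature maps are already at the same spatial size.
        # Additive attention: sigmoid(psi(...)) in (0, 1) per location.
        attn = torch.sigmoid(self.psi(F.relu(self.w_enc(enc_feat) +
                                             self.w_dec(dec_feat))))
        return enc_feat * attn

class SaliencyHead(nn.Module):
    """Maps one decoder block's features to a 1-channel saliency logit map,
    upsampled to the input resolution so all side outputs can be fused."""
    def __init__(self, in_ch, out_size):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 1, kernel_size=3, padding=1)
        self.out_size = out_size  # e.g. (224, 224)

    def forward(self, x):
        return F.interpolate(self.conv(x), size=self.out_size,
                             mode='bilinear', align_corners=False)

def fuse_and_loss(side_logits, target):
    """Fuse per-stage logit maps by averaging (a simplifying assumption; the
    fusion may be learned in the paper) and apply binary cross-entropy to
    the fused map and, as deep supervision, to each side output."""
    fused = torch.stack(side_logits, dim=0).mean(dim=0)
    loss = F.binary_cross_entropy_with_logits(fused, target)
    for logits in side_logits:
        loss = loss + F.binary_cross_entropy_with_logits(logits, target)
    return torch.sigmoid(fused), loss
```

With four decoder stages, `side_logits` would hold four upsampled maps; averaging them and supervising each one with the ground-truth mask mirrors the deeply supervised training common in FCN-based SOD, though the paper's precise fusion and loss weighting may differ.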

URL

https://arxiv.org/abs/2405.02906

PDF

https://arxiv.org/pdf/2405.02906.pdf

