A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos

2024-03-18 11:56:32
Zhengzheng Tu, Zigang Zhu, Yayang Duan, Bo Jiang, Qishun Wang, Chaoxue Zhang

Abstract

Ultrasound video-based breast lesion segmentation provides valuable assistance in early breast lesion detection and treatment. However, existing works mainly focus on lesion segmentation in still ultrasound breast images, and these methods usually do not adapt well to ultrasound videos. The main challenge for ultrasound video-based breast lesion segmentation is how to exploit intra-frame and inter-frame lesion cues simultaneously. To address this problem, we propose a novel Spatial-Temporal Progressive Fusion Network (STPFNet) for video-based breast lesion segmentation. The main aspects of the proposed STPFNet are threefold. First, we adopt a unified network architecture that jointly captures spatial dependencies within each ultrasound frame and temporal correlations between different frames for ultrasound data representation. Second, we propose a new fusion module, termed Multi-Scale Feature Fusion (MSFF), to fuse spatial and temporal cues for lesion detection. MSFF helps to determine the boundary contour of the lesion region and so mitigates the problem of blurred lesion boundaries. Third, we use the segmentation result of the previous frame as prior knowledge to suppress the noisy background and learn a more robust representation. In addition, we introduce a new publicly available ultrasound video dataset, termed UVBLS200, which is dedicated to breast lesion segmentation. It contains 200 videos: 80 of benign lesions and 120 of malignant lesions. Experiments on the proposed dataset demonstrate that STPFNet achieves better breast lesion detection performance than state-of-the-art methods.
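
The third point, using the previous frame's segmentation as a prior, can be illustrated with a small sketch. The abstract does not say how STPFNet applies the prior, so the multiplicative gating below is only an assumption chosen for illustration, not the authors' method. The function name `gate_with_prior` and the `floor` parameter are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def gate_with_prior(features: Grid, prev_mask: Grid, floor: float = 0.1) -> Grid:
    """Attenuate current-frame features where the previous mask is background.

    Assumption for illustration only: the prior is applied as an
    element-wise multiplicative gate. `floor` (hypothetical) keeps a
    fraction of the signal everywhere, so a lesion that shifts between
    frames is weakened rather than erased.
    """
    if not 0.0 <= floor <= 1.0:
        raise ValueError("floor must lie in [0, 1]")
    same_shape = len(features) == len(prev_mask) and all(
        len(f_row) == len(m_row) for f_row, m_row in zip(features, prev_mask)
    )
    if not same_shape:
        raise ValueError("features and prev_mask must have the same shape")
    # Gate value is 1.0 where the previous mask is 1 and `floor` where it is 0.
    return [
        [f * (floor + (1.0 - floor) * m) for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(features, prev_mask)
    ]


# Toy 2x2 feature map and a soft mask predicted for the previous frame.
features = [[0.9, 0.8], [0.7, 0.6]]
prev_mask = [[1.0, 0.0], [0.5, 0.0]]
gated = gate_with_prior(features, prev_mask, floor=0.1)
```

In this sketch, positions the previous frame marked as lesion keep their full response, while positions marked as background are scaled down to the floor value. A learned network would typically replace the fixed gate with trainable layers.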

URL

https://arxiv.org/abs/2403.11699

PDF

https://arxiv.org/pdf/2403.11699.pdf

