Paper Reading AI Learner

Residual Dense Swin Transformer for Continuous Depth-Independent Ultrasound Imaging

2024-03-25 03:01:53
Jintong Hu, Hui Che, Zishuo Li, Wenming Yang

Abstract

Ultrasound imaging is crucial for evaluating organ morphology and function, yet depth adjustment can degrade image quality and field-of-view, presenting a depth-dependent dilemma. Traditional interpolation-based zoom-in techniques often sacrifice detail and introduce artifacts. Motivated by the potential of arbitrary-scale super-resolution to naturally address these inherent challenges, we present the Residual Dense Swin Transformer Network (RDSTN), designed to capture the non-local characteristics and long-range dependencies intrinsic to ultrasound images. It comprises a linear embedding module for feature enhancement, an encoder with shifted-window attention for modeling non-locality, and an MLP decoder for continuous detail reconstruction. This strategy streamlines the trade-off between image quality and field-of-view, yielding superior textures compared with traditional methods. Experimentally, RDSTN outperforms existing approaches while requiring fewer parameters. In conclusion, RDSTN shows promising potential for ultrasound image enhancement by overcoming the limitations of conventional interpolation-based methods and achieving depth-independent imaging.
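The "MLP decoder for continuous detail reconstruction" is the ingredient that makes the super-resolution arbitrary-scale: instead of upsampling by a fixed factor, the decoder is queried at any continuous coordinate over the encoder's feature map. The sketch below illustrates that idea in NumPy under loose assumptions (a LIIF-style nearest-feature lookup plus coordinate offset; the function names, MLP sizes, and random weights are illustrative, not the paper's implementation):

```python
import numpy as np

def mlp_decode(features, coords, w1, b1, w2, b2):
    """Query pixel intensities at continuous coords in [0,1]^2 from a feature map.

    Illustrative arbitrary-scale decoding: for each query coordinate, take the
    nearest latent feature vector and the relative offset to its cell center,
    then run a small MLP. Any output resolution can thus be sampled from one
    fixed-size feature map.
    """
    h, w, c = features.shape
    out = []
    for y, x in coords:
        # nearest feature cell, and offset from that cell's center
        i = min(int(y * h), h - 1)
        j = min(int(x * w), w - 1)
        cy, cx = (i + 0.5) / h, (j + 0.5) / w
        inp = np.concatenate([features[i, j], [y - cy, x - cx]])
        hidden = np.maximum(0.0, inp @ w1 + b1)  # ReLU hidden layer
        out.append(hidden @ w2 + b2)
    return np.array(out)

# Toy usage: a 4x4 feature map with 8 channels and random MLP weights.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 4, 8))
w1, b1 = rng.standard_normal((10, 16)), np.zeros(16)  # 8 features + 2 offsets in
w2, b2 = rng.standard_normal((16, 1)), np.zeros(1)
# Sample a 12x12 output grid (3x "zoom") from the same latent features.
grid = [((i + 0.5) / 12, (j + 0.5) / 12) for i in range(12) for j in range(12)]
pixels = mlp_decode(feat, grid, w1, b1, w2, b2)
print(pixels.shape)  # (144, 1)
```

Because the query grid is decoupled from the feature-map size, the same trained decoder can render any scale factor, which is what lets the method trade depth/field-of-view against resolution without interpolation artifacts.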

URL

https://arxiv.org/abs/2403.16384

PDF

https://arxiv.org/pdf/2403.16384.pdf

