H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images

2024-03-27 08:28:14
Jinpeng Lu, Jingyun Chen, Linghan Cai, Songhan Jiang, Yongbing Zhang

Abstract

Positron emission tomography (PET) combined with computed tomography (CT) imaging is routinely used in cancer diagnosis and prognosis because the two modalities provide complementary information. Automatically segmenting tumors in PET/CT images can significantly improve examination efficiency. Traditional multi-modal segmentation solutions mainly rely on concatenation operations for modality fusion, which fail to effectively model the non-linear dependencies between PET and CT modalities. Recent studies have investigated various approaches to optimize the fusion of modality-specific features for enhancing joint representations. However, the modality-specific encoders used in these methods operate independently and therefore inadequately leverage the synergistic relationships inherent in PET and CT modalities, for example, the complementarity between semantics and structure. To address these issues, we propose a Hierarchical Adaptive Interaction and Weighting Network, termed H2ASeg, to explore the intrinsic cross-modal correlations and transfer potential complementary information. Specifically, we design a Modality-Cooperative Spatial Attention (MCSA) module that performs intra- and inter-modal interactions globally and locally. Additionally, a Target-Aware Modality Weighting (TAMW) module is developed to highlight tumor-related features within multi-modal features, thereby refining tumor segmentation. By embedding these modules across different layers, H2ASeg can hierarchically model cross-modal correlations, enabling a nuanced understanding of both semantic and structural tumor features. Extensive experiments show that H2ASeg outperforms state-of-the-art methods on the AutoPet-II and Hecktor2022 benchmarks. The code has been publicly released.
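
The contrast the abstract draws between concatenation-based fusion and adaptive modality weighting can be illustrated with a toy sketch. This is not the authors' TAMW or MCSA module; the function names, the per-location relevance scores, and the softmax weighting are all assumptions chosen for illustration. In a real network the scores would be predicted by learned layers rather than passed in by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(pet_feat, ct_feat):
    """Baseline fusion: stack the two feature vectors unchanged.

    Neither modality influences the other; any interaction has to be
    learned by whatever layers come after the concatenation.
    """
    return pet_feat + ct_feat


def weighted_fusion(pet_feat, ct_feat, pet_score, ct_score):
    """Hypothetical adaptive fusion for one spatial location.

    pet_score and ct_score are assumed relevance scores (for example,
    how tumor-related each modality's feature looks at this location).
    They are turned into weights that sum to 1 and used to blend the
    two feature vectors element-wise.
    """
    w_pet, w_ct = softmax([pet_score, ct_score])
    return [w_pet * p + w_ct * c for p, c in zip(pet_feat, ct_feat)]


if __name__ == "__main__":
    pet = [1.0, 2.0]  # toy PET features at one location
    ct = [3.0, 4.0]   # toy CT features at the same location
    print(concat_fusion(pet, ct))
    print(weighted_fusion(pet, ct, pet_score=0.0, ct_score=0.0))
    print(weighted_fusion(pet, ct, pet_score=5.0, ct_score=0.0))
```

With equal scores the blend is a plain average of the two modalities; as one score grows, the fused feature moves toward that modality. Concatenation, by contrast, always returns the same stacked vector regardless of which modality is more informative at a location.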

URL

https://arxiv.org/abs/2403.18339

PDF

https://arxiv.org/pdf/2403.18339.pdf

