H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images

Abstract

Positron emission tomography (PET) combined with computed tomography (CT) imaging is routinely used in cancer diagnosis and prognosis by providing complementary information. Automatically segmenting tumors in PET/CT images can significantly improve examination efficiency. Traditional multi-modal segmentation solutions mainly rely on concatenation operations for modality fusion, which fail to effectively model the non-linear dependencies between PET and CT modalities. Recent studies have investigated various approaches to optimize the fusion of modality-specific features for enhancing joint representations. However, modality-specific encoders used in these methods operate independently, inadequately leveraging the synergistic relationships inherent in PET and CT modalities, for example, the complementarity between semantics and structure. To address these issues, we propose a Hierarchical Adaptive Interaction and Weighting Network termed H2ASeg to explore the intrinsic cross-modal correlations and transfer potential complementary information. Specifically, we design a Modality-Cooperative Spatial Attention (MCSA) module that performs intra- and inter-modal interactions globally and locally. Additionally, a Target-Aware Modality Weighting (TAMW) module is developed to highlight tumor-related features within multi-modal features, thereby refining tumor segmentation. By embedding these modules across different layers, H2ASeg can hierarchically model cross-modal correlations, enabling a nuanced understanding of both semantic and structural tumor features. Extensive experiments demonstrate the superiority of H2ASeg, outperforming state-of-the-art methods on AutoPet-II and Hecktor2022 benchmarks. The code is released at this https URL.
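
The contrast drawn above between concatenation-based fusion and adaptive, target-aware weighting can be illustrated with a toy sketch. This is not the paper's MCSA or TAMW implementation, which the abstract does not specify. The function names, the softmax gating rule, and the use of a coarse tumor probability map (`target_prob`) are all assumptions made for illustration. Features are plain Python lists of channels, each channel a flat list of voxel values.

```python
import math


def concat_fusion(pet_feat, ct_feat):
    """Baseline fusion: stack PET and CT channels with no cross-modal interaction."""
    return pet_feat + ct_feat


def weighted_fusion(pet_feat, ct_feat, target_prob):
    """Hypothetical target-aware weighting (illustrative only, not the paper's TAMW).

    For each voxel, a softmax over two modality scores decides how much PET and CT
    contribute. Each score is the mean channel activation scaled by an assumed
    coarse tumor probability, so voxels with target_prob == 0 get equal weights.
    Both inputs must have the same number of channels and voxels.
    """
    n_channels, n_voxels = len(pet_feat), len(target_prob)
    fused = [[0.0] * n_voxels for _ in range(n_channels)]
    weights = []
    for v in range(n_voxels):
        pet_score = target_prob[v] * sum(ch[v] for ch in pet_feat) / n_channels
        ct_score = target_prob[v] * sum(ch[v] for ch in ct_feat) / n_channels
        # Numerically stable two-way softmax over the modality scores.
        top = max(pet_score, ct_score)
        e_pet, e_ct = math.exp(pet_score - top), math.exp(ct_score - top)
        w_pet = e_pet / (e_pet + e_ct)
        w_ct = 1.0 - w_pet
        weights.append((w_pet, w_ct))
        for c in range(n_channels):
            fused[c][v] = w_pet * pet_feat[c][v] + w_ct * ct_feat[c][v]
    return fused, weights


# Two channels, two voxels; only the second voxel is flagged as likely tumor.
pet = [[1.0, 2.0], [3.0, 0.0]]
ct = [[0.0, 2.0], [1.0, 4.0]]
fused, weights = weighted_fusion(pet, ct, target_prob=[0.0, 1.0])
```

In this sketch concatenation doubles the channel count and leaves any cross-modal reasoning to later layers, whereas the weighted variant keeps the channel count and shifts emphasis toward whichever modality responds more strongly inside the assumed tumor region.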

URL

https://arxiv.org/abs/2403.18339

PDF

https://arxiv.org/pdf/2403.18339.pdf