Paper Reading AI Learner

Any-to-All MRI Synthesis: A Unified Foundation Model for Nasopharyngeal Carcinoma and Its Downstream Applications

2026-02-09 15:56:53
Yao Pu, Yiming Shi, Zhenxi Zhang, Peixin Yu, Yitao Zhuang, Xiang Wang, Hongzhao Chen, Jing Cai, Ge Ren

Abstract

Magnetic resonance imaging (MRI) is essential for nasopharyngeal carcinoma (NPC) radiotherapy (RT), but practical constraints, such as patient discomfort, long scan times, and high costs often lead to incomplete modalities in clinical practice, compromising RT planning accuracy. Traditional MRI synthesis methods are modality-specific, limited in anatomical adaptability, and lack clinical interpretability-failing to meet NPC's RT needs. Here, we developed a unified foundation model integrating contrastive visual representation learning and vision-language alignment (VLA) to enable any-to-all MRI synthesis. The model uses a contrastive encoder for modality-invariant representations and a CLIP-based text-informed decoder for semantically consistent synthesis, supporting any-to-all MRI synthesis via one unified foundation model. Trained on 40,825 images from 13 institutions, it achieves consistently high performance (average SSIM 0.90, PSNR 27) across 26 internal/external validation sites (15,748 images), with superior synthesis fidelity and robustness to noise and domain shifts. Meanwhile, its unified representation enhances downstream RT-relevant tasks (e.g., segmentation). This work advances digital medicine solutions for NPC care by leveraging foundation models to bridge technical synthesis and clinical utility.

Abstract (translated)

磁共振成像(MRI)对于鼻咽癌(NPC)放射治疗(RT)至关重要,但实际限制如患者不适、长时间的扫描时间和高昂的成本往往导致临床实践中模式不完整,从而影响了RT计划的准确性。传统的MRI合成方法特定于某种模态,且在解剖适应性和临床解释性方面存在局限,无法满足NPC RT的需求。 为此,我们开发了一种结合对比视觉表示学习和视觉-语言对齐(VLA)的统一基础模型,以实现任意到所有模式的MRI合成。该模型利用对比编码器生成模态不变的表示,并采用基于CLIP的方法进行语义一致的解码,支持通过一个统一的基础模型来完成任何到所有MRI模式的转换。此模型在来自13个机构的40,825张图像上进行了训练,在包括内外部验证在内的26个站点(总计15,748张图像)上实现了稳定的高性能(平均SSIM 0.90,PSNR 27),同时具备优异的合成保真度和对噪声及领域变化的强大鲁棒性。此外,该模型的统一表示方式还增强了下游RT相关的任务(如分割任务)。这项工作通过利用基础模型来弥合技术合成与临床实用性的差距,推进了数字医学解决方案在NPC护理中的应用。

URL

https://arxiv.org/abs/2602.08822

PDF

https://arxiv.org/pdf/2602.08822.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Time_Series Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot