Latent Anomaly Detection: Masked VQ-GAN for Unsupervised Segmentation in Medical CBCT

2025-06-17 05:58:04
Pengwei Wang

Abstract

Advances in treatment technology now allow the use of customizable 3D-printed hydrogel wound dressings for patients with osteoradionecrosis of the jaw (ONJ). Meanwhile, deep learning has enabled precise segmentation of 3D medical images using tools like nnUNet. However, labeled data in ONJ imaging is scarce, which makes supervised training impractical. This study aims to develop an unsupervised training approach for automatically identifying anomalies in imaging scans. We propose a novel two-stage training pipeline. In the first stage, a VQ-GAN is trained to accurately reconstruct normal subjects. In the second stage, random cube masking and ONJ-specific masking are applied to train a new encoder capable of recovering the data. The proposed method achieves successful segmentation on both simulated and real patient data. This approach provides a fast initial segmentation, reducing the burden of manual labeling. Additionally, it could be used directly for 3D printing when combined with hand-tuned post-processing.
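
The random cube masking mentioned in the second stage can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `random_cube_mask`, the parameters `num_cubes`, `cube_size`, and `fill_value`, and the choice to fill masked voxels with a constant are all assumptions made for illustration. The abstract does not specify cube sizes, counts, or how the ONJ-specific masks are built, so those are not modeled here.

```python
import numpy as np


def random_cube_mask(volume, num_cubes=4, cube_size=8, fill_value=0.0, rng=None):
    """Hide random cubes in a 3D volume (hypothetical helper, not from the paper).

    Returns a masked copy of the volume and a boolean mask of the hidden voxels.
    """
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume")
    rng = np.random.default_rng() if rng is None else rng

    mask = np.zeros(volume.shape, dtype=bool)
    for _ in range(num_cubes):
        # Pick each cube's corner so that the cube stays inside the volume.
        starts = [
            int(rng.integers(0, max(dim - cube_size, 0) + 1))
            for dim in volume.shape
        ]
        region = tuple(slice(s, s + cube_size) for s in starts)
        mask[region] = True  # cubes may overlap

    masked = volume.copy()  # leave the input untouched
    masked[mask] = fill_value
    return masked, mask


if __name__ == "__main__":
    # Toy stand-in for a CBCT volume; real scans would be loaded from disk.
    scan = np.ones((32, 32, 32), dtype=np.float32)
    masked_scan, hidden = random_cube_mask(scan, rng=np.random.default_rng(0))
    print(masked_scan.shape, int(hidden.sum()))
```

In a training loop of the kind the abstract describes, `masked_scan` would be the encoder input and the original `scan` the reconstruction target; how the loss is computed is not stated in the abstract and is left out of this sketch.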

URL

https://arxiv.org/abs/2506.14209

PDF

https://arxiv.org/pdf/2506.14209.pdf

