Paper Reading AI Learner

Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation

2024-03-21 14:36:59
Mathias Öttl, Frauke Wilm, Jana Steenpass, Jingna Qiu, Matthias Rübner, Arndt Hartmann, Matthias Beckmann, Peter Fasching, Andreas Maier, Ramona Erber, Bernhard Kainz, Katharina Breininger

Abstract

Deep learning-based image generation has seen significant advancements with diffusion models, notably improving the quality of generated images. Despite these developments, generating images with unseen characteristics that benefit downstream tasks has received limited attention. To bridge this gap, we propose Style-Extracting Diffusion Models, which feature two conditioning mechanisms: 1) a style conditioning mechanism that allows style information from previously unseen images to be injected during image generation, and 2) a content conditioning mechanism that can be targeted to a downstream task, e.g., a layout for segmentation. We introduce a trainable style encoder to extract style information from images, and an aggregation block that merges style information from multiple style inputs. This architecture enables zero-shot generation of images with unseen styles by leveraging styles from unseen images, resulting in more diverse generations. In this work, we use the image layout as the target condition and first show the capability of our method on a natural image dataset as a proof of concept. We further demonstrate its versatility in histopathology, where we combine prior knowledge about tissue composition with unannotated data to create diverse synthetic images with known layouts. This allows us to generate additional synthetic data to train a segmentation network in a semi-supervised fashion. We verify the added value of the generated images by showing improved segmentation results and lower performance variability between patients when synthetic images are included during segmentation training. Our code will be made publicly available at [LINK].
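
The two conditioning paths described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say how the style encoder or the aggregation block work internally, so the code below assumes a fixed statistic (mean and standard deviation of pixel intensities) in place of the trainable encoder and an element-wise mean in place of the aggregation block. The names `encode_style`, `aggregate_styles`, and `build_conditioning` are hypothetical.

```python
from statistics import fmean
from typing import Dict, List, Sequence

Image = List[List[float]]   # toy grayscale image: rows of pixel intensities
Layout = List[List[int]]    # toy segmentation layout: rows of class indices


def encode_style(image: Image) -> List[float]:
    """Hypothetical stand-in for the trainable style encoder.

    Assumption: style is summarized as [mean, std] of pixel intensities.
    A real encoder would be a learned network.
    """
    pixels = [p for row in image for p in row]
    mean = fmean(pixels)
    variance = fmean((p - mean) ** 2 for p in pixels)
    return [mean, variance ** 0.5]


def aggregate_styles(style_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Assumed aggregation: element-wise mean over all style inputs."""
    if not style_vectors:
        raise ValueError("at least one style vector is required")
    return [fmean(dimension) for dimension in zip(*style_vectors)]


def build_conditioning(style_images: Sequence[Image], layout: Layout) -> Dict[str, object]:
    """Bundle the merged style vector with the content (layout) condition."""
    merged_style = aggregate_styles([encode_style(img) for img in style_images])
    return {"style": merged_style, "content": layout}


# Two unannotated "style" images and one known layout.
dark = [[0.0, 0.0], [0.0, 0.0]]
bright = [[1.0, 1.0], [1.0, 1.0]]
layout = [[0, 1], [1, 1]]
conditioning = build_conditioning([dark, bright], layout)
```

In this sketch the style images need no annotations, while the layout supplies the segmentation labels for whatever image a generator would produce from `conditioning`. That pairing is what makes the synthetic data usable for semi-supervised training.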

URL

https://arxiv.org/abs/2403.14429

PDF

https://arxiv.org/pdf/2403.14429.pdf

