
An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion

2024-01-28 20:30:14
Sharib Ali, Yamid Espinel, Yueming Jin, Peng Liu, Bianca Güttner, Xukun Zhang, Lihua Zhang, Tom Dowrick, Matthew J. Clarkson, Shiting Xiao, Yifan Wu, Yijun Yang, Lei Zhu, Dai Sun, Lan Li, Micha Pfeiffer, Shahid Farid, Lena Maier-Hein, Emmanuel Buc, Adrien Bartoli

Abstract

Augmented reality for laparoscopic liver resection is a visualisation mode that allows a surgeon to localise tumours and vessels embedded within the liver by projecting them on top of a laparoscopic image. To do this, preoperative 3D models extracted from CT or MRI data are registered to the intraoperative laparoscopic images. For 3D-2D fusion, most algorithms use anatomical landmarks to guide registration. These landmarks include the liver's inferior ridge, the falciform ligament, and the occluding contours. They are usually marked by hand in both the laparoscopic image and the 3D model, which is time-consuming and error-prone when done by an inexperienced user. There is therefore a need to automate this process so that augmented reality can be used effectively in the operating room. We present the Preoperative-to-Intraoperative Laparoscopic Fusion Challenge (P2ILF), held during the Medical Image Computing and Computer Assisted Intervention (MICCAI 2022) conference, which investigates the possibilities of detecting these landmarks automatically and using them in registration. The challenge was divided into two tasks: 1) a 2D and 3D landmark detection task and 2) a 3D-2D registration task. The teams were provided with training data consisting of 167 laparoscopic images and 9 preoperative 3D models from 9 patients, with the corresponding 2D and 3D landmark annotations. A total of 6 teams from 4 countries participated, and their proposed methods were evaluated on 16 images and two preoperative 3D models from two patients. All the teams proposed deep learning-based methods for the 2D and 3D landmark segmentation tasks and differentiable rendering-based methods for the registration task. Based on the experimental outcomes, we propose three key hypotheses that determine current limitations and future directions for research in this domain.
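To make the idea of landmark-guided 3D-2D registration concrete, the sketch below projects 3D landmark points into an image with a pinhole camera model and measures how far they land from their 2D annotations. It is only an illustration under strong assumptions: the function names, the intrinsics (`focal_px`, `principal_point`), and the translation-only pose are hypothetical simplifications, not the challenge's evaluation code or any team's method. A real pipeline would also estimate rotation and liver deformation.

```python
from math import hypot


def project_point(point_3d, focal_px, principal_point):
    """Project a 3D point in camera coordinates to pixels (pinhole model)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def mean_reprojection_error(model_points, image_points, translation,
                            focal_px, principal_point):
    """Mean pixel distance between projected 3D landmarks and 2D annotations.

    Assumption: the pose is a pure translation of the preoperative model
    into the camera frame; rotation and deformation are ignored here.
    """
    if len(model_points) != len(image_points) or not model_points:
        raise ValueError("need matching, non-empty landmark lists")
    tx, ty, tz = translation
    total = 0.0
    for (x, y, z), (u, v) in zip(model_points, image_points):
        pu, pv = project_point((x + tx, y + ty, z + tz),
                               focal_px, principal_point)
        total += hypot(pu - u, pv - v)
    return total / len(model_points)
```

A registration method would search for the pose that drives this kind of error down; lower values mean the projected model landmarks sit closer to the landmarks marked in the laparoscopic image.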


URL

https://arxiv.org/abs/2401.15753

PDF

https://arxiv.org/pdf/2401.15753.pdf

