Paper Reading AI Learner

Robotic Surgery Remote Mentoring via AR with 3D Scene Streaming and Hand Interaction

2022-04-09 03:17:15
Yonghao Long, Chengkun Li, Qi Dou

Abstract

With the growing popularity of robotic surgery, education becomes increasingly important and urgently needed for the sake of patient safety. However, experienced surgeons have limited accessibility due to their busy clinical schedule or working in a distant city, thus can hardly provide sufficient education resources for novices. Remote mentoring, as an effective way, can help solve this problem, but traditional methods are limited to plain text, audio, or 2D video, which are not intuitive nor vivid. Augmented reality (AR), a thriving technique being widely used for various education scenarios, is promising to offer new possibilities of visual experience and interactive teaching. In this paper, we propose a novel AR-based robotic surgery remote mentoring system with efficient 3D scene visualization and natural 3D hand interaction. Using a head-mounted display (i.e., HoloLens), the mentor can remotely monitor the procedure streamed from the trainee's operation side. The mentor can also provide feedback directly with hand gestures, which is in-turn transmitted to the trainee and viewed in surgical console as guidance. We comprehensively validate the system on both real surgery stereo videos and ex-vivo scenarios of common robotic training tasks (i.e., peg-transfer and suturing). Promising results are demonstrated regarding the fidelity of streamed scene visualization, the accuracy of feedback with hand interaction, and the low-latency of each component in the entire remote mentoring system. This work showcases the feasibility of leveraging AR technology for reliable, flexible and low-cost solutions to robotic surgical education, and holds great potential for clinical applications.

Abstract (translated)

URL

https://arxiv.org/abs/2204.04377

PDF

https://arxiv.org/pdf/2204.04377.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot