Paper Reading AI Learner

RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods

2024-05-16 17:59:58
Xin Qiao, Matteo Poggi, Pengchao Deng, Hao Wei, Chenyang Ge, Stefano Mattoccia


Integrating an RGB camera into a ToF imaging system has become a significant technique for perceiving the real world. The RGB guided ToF imaging system is crucial to several applications, including face anti-spoofing, saliency detection, and trajectory prediction. Depending on the distance of the working range, the implementation schemes of the RGB guided ToF imaging systems are different. Specifically, ToF sensors with a uniform field of illumination, which can output dense depth but have low resolution, are typically used for close-range measurements. In contrast, LiDARs, which emit laser pulses and can only capture sparse depth, are usually employed for long-range detection. In the two cases, depth quality improvement for RGB guided ToF imaging corresponds to two sub-tasks: guided depth super-resolution and guided depth completion. In light of the recent significant boost to the field provided by deep learning, this paper comprehensively reviews the works related to RGB guided ToF imaging, including network structures, learning strategies, evaluation metrics, benchmark datasets, and objective functions. Besides, we present quantitative comparisons of state-of-the-art methods on widely used benchmark datasets. Finally, we discuss future trends and the challenges in real applications for further research.

Abstract (translated)

将RGB相机集成到ToF成像系统中已成为感知现实世界的重要技术。RGB引导ToF成像系统对于多个应用场景至关重要,包括面部抗伪造、轮廓检测和轨迹预测。根据工作范围的不同,RGB引导ToF成像系统的实现方案是不同的。具体来说,具有均匀场照明的ToF传感器通常用于近距离测量。相反,激光雷达,它们只能捕获稀疏深度,通常用于远距离检测。在这两种情况下,RGB引导ToF成像系统的深度质量改进相当于两个子任务:引导深度超分辨率 和引导深度完成。 鉴于最近深度学习在领域提供的重大提升,本文全面回顾了与RGB引导ToF成像相关的论文,包括网络结构、学习策略、评估指标、基准数据集和目标函数。此外,我们还在广泛使用的基准数据集上对最先进的方法进行了定量比较。最后,我们讨论了在实际应用中未来的趋势和挑战,为进一步研究提供了指导。



3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot