Dedge-AGMNet: A Robust Multi-task Learning Network for Stereo Matching and Depth Edge Detection

2019-10-28 08:43:47
Weida Yang

Abstract

End-to-end convolutional neural networks have recently achieved remarkable success in disparity estimation. However, these networks usually have difficulty finding correct correspondences in ill-posed regions, such as texture-less areas, edge details, and small objects. This paper proposes an atrous granular multi-scale network based on a depth edge subnetwork (Dedge-AGMNet) to overcome this difficulty. The work makes two main contributions. First, the devised depth edge subnetwork provides geometric knowledge and depth edge constraints. To incorporate the depth edge cues efficiently, the depth edge-spatial pyramid pooling (Dedge-SPP) module fuses the depth edge features into the disparity estimation branch. Separate loss functions are proposed for the supervised and unsupervised tasks, which improves the adaptability of the depth edge auxiliary network. Second, the designed granular convolution is well suited to constructing the atrous granular multi-scale (AGM) module, which captures multi-scale context information while requiring fewer parameters and fewer computational resources. In summary, the depth edge cues and the multi-scale context information both help to find potential corresponding points in ill-posed regions. When rankings across different stereo datasets are combined, our network outperforms other stereo matching networks and shows strong robustness across environments. Dedge-AGMNet advances state-of-the-art performance on the Sceneflow, KITTI 2012, and KITTI 2015 benchmark datasets.

URL

https://arxiv.org/abs/1908.09346

PDF

https://arxiv.org/pdf/1908.09346.pdf

