Scene Grammars, Factor Graphs, and Belief Propagation

2018-07-30 22:44:27
Jeroen Chua, Pedro F. Felzenszwalb

Abstract

We describe a general framework for probabilistic modeling of complex scenes and inference from ambiguous observations. The approach is motivated by applications in image analysis and is based on the use of priors defined by stochastic grammars. We define a class of grammars that capture relationships between the objects in a scene and provide important contextual cues for statistical inference. The distribution over scenes defined by a probabilistic scene grammar can be represented by a graphical model and this construction can be used for efficient inference with loopy belief propagation. We show experimental results with two different applications. One application involves the reconstruction of binary contour maps. Another application involves detecting and localizing faces in images. In both applications the same framework leads to robust inference algorithms that can effectively combine local information to reason about a scene.
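
As a rough illustration of the inference step the abstract mentions, the sketch below runs sum-product loopy belief propagation on a tiny pairwise model with binary variables and compares the result with brute-force marginals. It is not the authors' scene-grammar construction: the function names `loopy_bp` and `exact_marginals`, the variable names, and the potentials are all hypothetical, chosen only to show how local messages are combined into approximate marginals.

```python
from itertools import product


def loopy_bp(unary, pairwise, iters=50):
    """Sum-product belief propagation on a pairwise model with binary variables.

    unary:    dict var -> [phi(0), phi(1)]
    pairwise: dict (i, j) -> 2x2 table, indexed as table[x_i][x_j]
    Returns a dict var -> [belief(0), belief(1)].
    (Illustrative sketch; not the construction used in the paper.)
    """
    # One message per direction along every edge, initialised to uniform.
    msgs = {}
    for (i, j) in pairwise:
        msgs[(i, j)] = [0.5, 0.5]
        msgs[(j, i)] = [0.5, 0.5]

    def psi(i, j, xi, xj):
        # Look up the edge potential regardless of how the edge was stored.
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    # Product of messages arriving at i from neighbours other than j.
                    incoming = 1.0
                    for (k, t) in msgs:
                        if t == i and k != j:
                            incoming *= msgs[(k, t)][xi]
                    total += unary[i][xi] * psi(i, j, xi, xj) * incoming
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]  # normalise for numerical stability
        msgs = new  # synchronous update of all messages

    beliefs = {}
    for v in unary:
        b = list(unary[v])
        for (k, t) in msgs:
            if t == v:
                b = [b[x] * msgs[(k, t)][x] for x in (0, 1)]
        z = sum(b)
        beliefs[v] = [x / z for x in b]
    return beliefs


def exact_marginals(unary, pairwise):
    """Brute-force marginals by enumerating every joint assignment."""
    names = sorted(unary)
    totals = {v: [0.0, 0.0] for v in names}
    for assignment in product((0, 1), repeat=len(names)):
        x = dict(zip(names, assignment))
        w = 1.0
        for v in names:
            w *= unary[v][x[v]]
        for (i, j), table in pairwise.items():
            w *= table[x[i]][x[j]]
        for v in names:
            totals[v][x[v]] += w
    return {v: [t / sum(ts) for t in ts] for v, ts in totals.items()}


# Hypothetical three-variable loop: only "a" has local evidence,
# and every edge mildly prefers that its endpoints agree.
agree = [[2.0, 1.0], [1.0, 2.0]]
unary = {"a": [0.7, 0.3], "b": [0.5, 0.5], "c": [0.5, 0.5]}
pairwise = {("a", "b"): agree, ("b", "c"): agree, ("a", "c"): agree}

approx = loopy_bp(unary, pairwise)
exact = exact_marginals(unary, pairwise)
for v in sorted(unary):
    print(v, approx[v], exact[v])
```

On a graph without cycles this message-passing scheme recovers the exact marginals; on a graph with loops, such as the triangle above, the beliefs are only approximations, which is why the brute-force comparison is included.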

URL

https://arxiv.org/abs/1606.01307

PDF

https://arxiv.org/pdf/1606.01307.pdf

