Paper Reading AI Learner

Stereo 3D Object Trajectory Reconstruction

2018-08-27 15:04:04
Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

Abstract

We present a method to reconstruct the three-dimensional trajectory of a moving instance of a known object category using stereo video data. We track the two-dimensional shape of objects on pixel level exploiting instance-aware semantic segmentation techniques and optical flow cues. We apply Structure from Motion (SfM) techniques to object and background images to determine for each frame initial camera poses relative to object instances and background structures. We refine the initial SfM results by integrating stereo camera constraints exploiting factor graphs. We compute the object trajectory by combining object and background camera pose information. In contrast to stereo matching methods, our approach leverages temporal adjacent views for object point triangulation. As opposed to monocular trajectory reconstruction approaches, our method shows no degenerated cases. We evaluate our approach using publicly available video data of vehicles in urban scenes.

Abstract (translated)

我们提出了一种使用立体视频数据重建已知对象类别的移动实例的三维轨迹的方法。我们利用实例感知语义分割技术和光流提示在像素级跟踪对象的二维形状。我们将结构从运动(SfM)技术应用于对象和背景图像,以确定每个帧相对于对象实例和背景结构的初始相机姿势。我们通过利用因子图集成立体相机约束来优化初始SfM结果。我们通过组合对象和背景相机姿势信息来计算对象轨迹。与立体匹配方法相比,我们的方法利用时间相邻视图进行对象点三角测量。与单眼轨迹重建方法相反,我们的方法显示没有退化的情况。我们使用公共可用的城市场景中的车辆视频数据来评估我们的方法。

URL

https://arxiv.org/abs/1808.09297

PDF

https://arxiv.org/pdf/1808.09297.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot