Abstract
The Optical Image Stabilization (OIS) system in mobile devices reduces image blur by steering the lens to compensate for hand jitter. However, OIS dynamically changes the camera's intrinsic parameters (i.e., the $\mathrm{K}$ matrix), which hinders accurate camera pose estimation and 3D reconstruction. Here we propose a novel neural network-based approach that estimates the $\mathrm{K}$ matrix in real time so that pose estimation or scene reconstruction can run at the camera's native resolution for the highest accuracy on mobile devices. Our network design takes the gridified projection model discrepancy feature and 3D point positions as inputs and employs a Multi-Layer Perceptron (MLP) to approximate the $f_{\mathrm{K}}$ manifold. We also design a unique training scheme for this network by introducing a back-propagatable PnP (BPnP) layer so that the reprojection error can be adopted as the loss function. The training process utilizes precise calibration patterns to capture the $f_{\mathrm{K}}$ manifold accurately, but the trained network can be used anywhere. We name the proposed Dynamic Intrinsic Manifold Estimation network DIME-Net and have implemented and tested it on three different mobile devices. In all cases, DIME-Net reduces the reprojection error by at least $64\%$, indicating that our design is successful.
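The reprojection error adopted as the loss function above can be sketched as follows. This is not the authors' code but a minimal, dependency-free illustration, assuming a pinhole camera model with intrinsics $(f_x, f_y, c_x, c_y)$ and a known pose $(R, t)$; all function names are illustrative.

```python
# Hedged sketch of the reprojection-error loss minimized during BPnP-based
# training: project 3D points with candidate intrinsics K and compare against
# observed 2D detections (e.g., calibration-pattern corners).

def project(K, R, t, X):
    """Project a 3D world point X to pixel coordinates.

    K: (fx, fy, cx, cy) pinhole intrinsics.
    R: 3x3 rotation (row-major nested lists), t: translation (length 3).
    """
    # Camera-frame coordinates: Xc = R @ X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    # Perspective division followed by the intrinsic mapping.
    return (fx * Xc[0] / Xc[2] + cx, fy * Xc[1] / Xc[2] + cy)

def reprojection_error(K, R, t, points3d, points2d):
    """Mean Euclidean distance between observed and reprojected points."""
    total = 0.0
    for X, (u, v) in zip(points3d, points2d):
        pu, pv = project(K, R, t, X)
        total += ((pu - u) ** 2 + (pv - v) ** 2) ** 0.5
    return total / len(points3d)
```

In the actual training pipeline, the intrinsics $K$ would come from the MLP's output and the gradient of this loss would flow back through the BPnP layer to the network weights; here the loss is shown in isolation.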
URL
https://arxiv.org/abs/2303.11307