Self-supervised Visual Odometry Method Based on Dynamic Region Detection

MA Wei (马伟); JIA Zhaokuan (贾兆款); MI Qing (米庆)

doi:10.11936/bjutxb2020120036

Self-supervised Visual Odometry Method Based on Dynamic Region Detection

Abstract

Abstract: To reduce the interference of dynamic regions in outdoor scenes with visual odometry, and to obtain accurate camera poses and scene depth, a visual odometry algorithm that integrates dynamic region detection within a self-supervised deep learning framework is proposed. Given two adjacent frames, a depth estimation network first computes the depth map of each image, and a pose estimation network estimates their initial relative pose. Dynamic regions are then identified by warping one depth map into the other view and measuring the difference between the two depth maps. Based on the detected regions, the dynamic and static parts of the input images are separated. Features in the static regions of the two images are then matched to compute the final camera pose. The loss function combines photometric, smoothness, and geometric consistency terms and incorporates the dynamic region information, and the network is trained end to end in a self-supervised manner. The algorithm was evaluated on the KITTI dataset and compared with related algorithms proposed in the preceding two years. Experimental results show that it handles dynamic scenes more robustly and achieves higher accuracy in camera pose estimation and in depth estimation of small objects.
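The dynamic region step can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the normalized depth difference, the threshold of 0.2, and the function names `depth_inconsistency`, `dynamic_region_mask`, and `masked_photometric_error` are hypothetical. The sketch also assumes that one depth map and one image have already been warped into the other view, so the inputs are pixel-aligned.

```python
import numpy as np


def depth_inconsistency(depth_warped, depth_target, eps=1e-7):
    """Normalized per-pixel difference between two aligned, positive depth maps.

    The normalization |a - b| / (a + b) is an assumed choice; it keeps the
    value in [0, 1) regardless of the absolute depth scale.
    """
    depth_warped = np.asarray(depth_warped, dtype=np.float64)
    depth_target = np.asarray(depth_target, dtype=np.float64)
    return np.abs(depth_warped - depth_target) / (depth_warped + depth_target + eps)


def dynamic_region_mask(depth_warped, depth_target, threshold=0.2):
    """Boolean mask that is True where the two depth maps disagree strongly.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return depth_inconsistency(depth_warped, depth_target) > threshold


def masked_photometric_error(image_warped, image_target, dynamic_mask):
    """Mean absolute intensity error computed over static pixels only."""
    error = np.abs(
        np.asarray(image_warped, dtype=np.float64)
        - np.asarray(image_target, dtype=np.float64)
    )
    if error.ndim == 3:
        # Average over color channels to get one error value per pixel.
        error = error.mean(axis=-1)
    static = ~np.asarray(dynamic_mask, dtype=bool)
    if not static.any():
        return 0.0
    return float(error[static].mean())
```

Excluding the masked pixels from the photometric term is one simple way to fold dynamic region information into a loss; the weighting actually used in the paper may differ.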
