Semantic Visual SLAM Based on Deep Learning in Dynamic Scenes

    Abstract: Traditional simultaneous localization and mapping (SLAM) assumes a static scene, so it suffers from poor robustness and low pose-estimation accuracy in dynamic scenes. To address this, a semantic visual SLAM method based on deep learning is proposed. The method combines semantic segmentation with a moving consistency check algorithm. First, each image is segmented by the Mask R-CNN network to establish prior semantic information about dynamic objects. Then, the moving consistency check further removes feature points that belong to dynamic objects. Finally, the remaining static feature points are used for feature matching and pose estimation. Experiments on the public Technical University of Munich (TUM) dataset show that, in dynamic environments, the system markedly reduces the absolute trajectory error and the relative pose error compared with the traditional ORB-SLAM2 and DS-SLAM systems, improving the accuracy and robustness of pose estimation.
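
    The feature-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the moving consistency check is computed, so this example assumes one common formulation: a matched point is treated as moving when it lies too far from the epipolar line induced by its match in the previous frame. The function names, the `max_dist` threshold, and the boolean `dynamic_mask` (standing in for the Mask R-CNN output) are all hypothetical and are not taken from the paper.

```python
import numpy as np


def epipolar_distance(F, pts_prev, pts_curr):
    """Distance (pixels) from each current point to the epipolar line
    F @ p_prev of its matched point in the previous frame."""
    ones = np.ones((len(pts_prev), 1))
    p1 = np.hstack([pts_prev, ones])   # homogeneous previous points, N x 3
    p2 = np.hstack([pts_curr, ones])   # homogeneous current points, N x 3
    lines = p1 @ F.T                   # row i is the line F @ p1[i] = (a, b, c)
    num = np.abs(np.sum(lines * p2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / np.maximum(den, 1e-12)


def select_static_points(pts_prev, pts_curr, dynamic_mask, F, max_dist=1.0):
    """Return a boolean array marking matches kept as static.

    Assumed inputs (hypothetical, for illustration only):
      pts_prev, pts_curr : N x 2 arrays of matched (x, y) pixel coordinates
      dynamic_mask       : H x W boolean array, True on pixels that the
                           segmentation network labels as a dynamic object
      F                  : 3 x 3 fundamental matrix between the two frames
      max_dist           : epipolar-distance threshold in pixels
    """
    cols = np.round(pts_curr[:, 0]).astype(int)
    rows = np.round(pts_curr[:, 1]).astype(int)
    h, w = dynamic_mask.shape
    inside = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)

    # Step 1: semantic prior, flag points that fall on a dynamic-object mask.
    on_dynamic = np.zeros(len(pts_curr), dtype=bool)
    on_dynamic[inside] = dynamic_mask[rows[inside], cols[inside]]

    # Step 2: consistency check, flag points that violate the epipolar constraint.
    consistent = epipolar_distance(F, pts_prev, pts_curr) <= max_dist

    # Step 3: only points passing both tests go on to matching and pose estimation.
    return ~on_dynamic & consistent
```

    In a full system the fundamental matrix would itself be estimated from the matches, typically with a robust method such as RANSAC, and the mask would come from the segmentation network; both are supplied directly here to keep the sketch self-contained.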

       
