Abstract:
Traditional simultaneous localization and mapping (SLAM) relies on the static-scene assumption, which leads to poor robustness and low pose-estimation accuracy in dynamic scenes. To address this problem, a semantic visual SLAM method based on deep learning is proposed, combining semantic segmentation with a moving consistency check algorithm. First, the image is segmented by the Mask R-CNN network to establish prior semantic information about dynamic objects. Then, feature points belonging to dynamic objects are further removed by the moving consistency check. Finally, the remaining static feature points are used for feature matching and pose estimation. The system was evaluated on the public dataset of the Technical University of Munich (TUM). Results show that, compared with the traditional ORB-SLAM2 and DS-SLAM in dynamic scenes, the proposed system reduces the absolute trajectory error and relative pose error, improving the accuracy and robustness of pose estimation.
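To illustrate the pipeline summarized above, the following is a minimal sketch, not the authors' implementation: it assumes a binary mask of prior dynamic classes (e.g., "person") produced by Mask R-CNN, matched keypoints between consecutive frames, and a known camera intrinsic matrix, and it realizes the moving consistency check as an epipolar-distance test using standard OpenCV calls. Function names and thresholds are illustrative assumptions.

```python
# Sketch only: drop feature points on dynamic-object masks, apply an epipolar
# moving-consistency check, and estimate pose from the remaining static points.
import numpy as np
import cv2


def remove_dynamic_points(pts_prev, pts_curr, dyn_mask):
    """Drop matches whose current-frame keypoint lies on a dynamic-object mask."""
    keep = np.array([dyn_mask[int(v), int(u)] == 0 for (u, v) in pts_curr])
    return pts_prev[keep], pts_curr[keep]


def moving_consistency_check(pts_prev, pts_curr, thresh=1.0):
    """Reject matches whose distance to the epipolar line exceeds `thresh` pixels."""
    F, inliers = cv2.findFundamentalMat(pts_prev, pts_curr, cv2.FM_RANSAC)
    lines = cv2.computeCorrespondEpilines(
        pts_prev.reshape(-1, 1, 2), 1, F).reshape(-1, 3)
    # Point-to-line distance |a*u + b*v + c| / sqrt(a^2 + b^2) in the current frame.
    num = np.abs(lines[:, 0] * pts_curr[:, 0]
                 + lines[:, 1] * pts_curr[:, 1] + lines[:, 2])
    dist = num / np.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2)
    keep = (dist < thresh) & (inliers.ravel() == 1)
    return pts_prev[keep], pts_curr[keep]


def estimate_pose(pts_prev, pts_curr, K):
    """Estimate relative camera rotation and translation from static matches."""
    E, _ = cv2.findEssentialMat(pts_prev, pts_curr, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts_prev, pts_curr, K)
    return R, t
```

In this sketch the semantic mask removes points on known dynamic classes first, and the consistency check then filters residual moving points that the segmentation misses, before pose estimation uses only the surviving static matches.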