Abstract:
Existing semi-supervised video target segmentation methods cannot ensure segmentation accuracy and efficiency at the same time. To address this problem, an attention mechanism was introduced into the general semi-supervised video target segmentation method to correct the segmentation results. First, an appearance feature extraction subnet was constructed to extract the feature map of the first video frame, which was used as appearance guidance information. Second, the segmentation result of the previous frame was obtained and used as position guidance information. Finally, a current-frame feature extraction subnet was constructed, which combined position correction attention and appearance correction attention in a dual-branch structure, so as to integrate the position and appearance information into the current-frame feature map and accomplish target segmentation. Experiments show that the proposed method can correct propagation errors in video target segmentation and improve segmentation accuracy.
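
The two-branch combination described above can be illustrated with a minimal NumPy sketch. It is not the authors' network: the abstract does not specify how either attention is computed or how the branches are merged, so the gating forms below (a residual spatial gate from the previous-frame mask, a sigmoid channel gate from globally pooled first-frame features, and fusion by summation) are assumptions, and all function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def position_attention(cur_feat, prev_mask):
    """Hypothetical position branch.

    cur_feat:  (C, H, W) feature map of the current frame.
    prev_mask: (H, W) segmentation result of the previous frame, values in [0, 1].
    Assumption: the mask acts as a residual spatial gate, so locations that
    were foreground in the previous frame are emphasised.
    """
    return cur_feat * (1.0 + prev_mask[None, :, :])


def appearance_attention(cur_feat, first_feat):
    """Hypothetical appearance branch.

    first_feat: (C, H, W) feature map of the first frame.
    Assumption: a per-channel gate is derived from the globally
    average-pooled first-frame features.
    """
    channel_desc = first_feat.mean(axis=(1, 2))   # shape (C,)
    gate = sigmoid(channel_desc)                  # shape (C,), values in (0, 1)
    return cur_feat * gate[:, None, None]


def fuse_dual_branch(cur_feat, first_feat, prev_mask):
    """Combine both branches; summation is an assumed fusion rule."""
    pos = position_attention(cur_feat, prev_mask)
    app = appearance_attention(cur_feat, first_feat)
    return pos + app


# Toy usage with random tensors standing in for real subnet outputs.
rng = np.random.default_rng(0)
cur = rng.standard_normal((8, 16, 16))
first = rng.standard_normal((8, 16, 16))
mask = (rng.random((16, 16)) > 0.5).astype(float)
fused = fuse_dual_branch(cur, first, mask)
print(fused.shape)  # (8, 16, 16)
```

In a trained model the gates would be produced by learned layers rather than fixed pooling; the sketch only shows how position and appearance guidance can each modulate the same current-frame feature map before the branches are merged.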