Abstract:
Few-shot semantic segmentation is a challenging task in computer vision: the goal is to predict the segmentation mask of unseen images using one or a few support images with dense segmentation annotations. For this task, a lightweight pyramid-prototype alignment network for few-shot semantic segmentation is proposed. First, building on the depthwise separable convolutions and inverted residual structure of the MobileNetV2 backbone, features are extracted through a pyramid pooling module, which preserves both high- and low-level information and yields features at multiple scales. Second, by aligning the prototypes of the support set with the query set, the network learns more from the support set and makes full use of its information as feedback. Extensive experiments on the PASCAL-5^i dataset show that the mean-IoU of the proposed network reaches 49.5% in the 1-way 1-shot setting and 56.6% in the 1-way 5-shot setting, which are 1.4% and 0.9% higher, respectively, than those of the mainstream few-shot segmentation network PANet. The proposed network has 3.0M parameters, 11.7M fewer than PANet, and its floating-point operations are also significantly reduced. The experimental results show that the network is both effective and efficient for few-shot semantic segmentation.
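The prototype alignment the abstract describes follows the general PANet-style recipe: a class prototype is computed from support features by masked average pooling, and query pixels are labeled by similarity to that prototype. The following is a minimal NumPy sketch of those two operations only (function names, shapes, and the random toy data are illustrative, not the paper's implementation; the backbone, pyramid pooling, and alignment loss are omitted):

```python
import numpy as np

def masked_average_pooling(feat, mask):
    # feat: (C, H, W) support feature map; mask: (H, W) binary foreground mask.
    # Returns the class prototype: the mean feature vector over masked pixels.
    eps = 1e-8
    weighted = feat * mask[None, :, :]
    return weighted.sum(axis=(1, 2)) / (mask.sum() + eps)

def cosine_similarity_map(feat, prototype):
    # Dense cosine similarity between each query-pixel feature and the prototype.
    eps = 1e-8
    f = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + eps)
    p = prototype / (np.linalg.norm(prototype) + eps)
    return np.tensordot(p, f, axes=([0], [0]))  # (H, W) similarity map

# Toy example with random features standing in for backbone outputs.
C, H, W = 4, 8, 8
rng = np.random.default_rng(0)
support_feat = rng.normal(size=(C, H, W))
support_mask = np.zeros((H, W))
support_mask[2:6, 2:6] = 1.0          # foreground region in the support image
query_feat = rng.normal(size=(C, H, W))

proto = masked_average_pooling(support_feat, support_mask)
sim = cosine_similarity_map(query_feat, proto)
pred_mask = (sim > 0).astype(np.float32)  # thresholded query prediction
```

In the full method, the same pooling applied to the query prediction yields a query prototype, which is matched back against the support set to form the alignment feedback mentioned above.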