Lightweight Pyramid Prototype Alignment Network for Few-shot Semantic Segmentation

    • Abstract: Few-shot semantic segmentation is a challenging problem in computer vision: the goal is to predict segmentation masks for images of unseen classes from only one or a few images with dense segmentation annotations. For this task, a lightweight few-shot semantic segmentation network based on pyramid prototype alignment is proposed. First, building on the depthwise separable convolutions and inverted residual structure of MobileNetV2, the network extracts features through a pyramid pooling module, which preserves both high-dimensional and low-dimensional information and yields features at multiple scales. In addition, the support-set prototypes and the query set are aligned with each other, so that the network learns more from the support set and makes full use of its information as feedback. Extensive experiments on the PASCAL-5i dataset show that the proposed network achieves a mean IoU of 49.5% in the 1-way 1-shot setting and 56.6% in the 1-way 5-shot setting, which is 1.4 and 0.9 percentage points higher, respectively, than PANet, an advanced mainstream few-shot segmentation network. The network has 3.0 M parameters, 11.7 M fewer than PANet, and requires significantly fewer floating-point operations, which demonstrates that it is both effective and efficient for few-shot semantic segmentation.
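The mutual alignment between support prototypes and the query set can be illustrated with a small sketch. It assumes a PANet-style formulation (class prototypes obtained by masked average pooling, pixels labeled by cosine similarity to the nearest prototype), which the abstract does not spell out. The function names and the toy 2-D features are hypothetical, and the backbone and pyramid pooling module are omitted entirely.

```python
import math

def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels where mask == 1."""
    selected = [vec for vec, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(vec[i] for vec in selected) / len(selected) for i in range(dim)]

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def prototypes_from(features, mask):
    """Return [background prototype, foreground prototype]."""
    background = [1 - m for m in mask]
    return [masked_average_pooling(features, background),
            masked_average_pooling(features, mask)]

def segment(features, prototypes):
    """Label each pixel with the index of its most similar prototype."""
    return [max(range(len(prototypes)), key=lambda k: cosine(vec, prototypes[k]))
            for vec in features]

# Toy data (assumed): one feature vector per pixel, 1 = foreground.
support_feats = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
support_mask = [1, 1, 0, 0]
query_feats = [[0.8, 0.2], [0.2, 0.9], [1.0, 0.0]]

# Support -> query: segment the query with support prototypes.
query_pred = segment(query_feats, prototypes_from(support_feats, support_mask))

# Query -> support: build prototypes from the predicted query mask and
# segment the support image back; agreement with the known support mask
# is the signal an alignment term would reward during training.
support_pred = segment(support_feats, prototypes_from(query_feats, query_pred))

print(query_pred)                     # [1, 0, 1]
print(support_pred == support_mask)   # True
```

In a trained network the reverse pass would contribute a loss term rather than a hard equality check; the comparison here only shows which quantities are being aligned.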

       
