No-Reference VMAF Video Quality Assessment Model Based on Multi-mode Bilinear Pooling and Temporal Pooling Aggregation

    • Abstract: In practical applications the original (reference) video is often unavailable, so this paper proposes a no-reference prediction model for video multimethod assessment fusion (VMAF). First, a convolutional neural network based on multi-mode bilinear pooling is used to build a frame-level no-reference VMAF prediction model, which predicts the VMAF score of each distorted video frame. Second, the predicted frame-level VMAF scores are aggregated separately by three different temporal pooling methods, and the results are fused into a single quality feature vector. Finally, nu support vector regression (NuSVR) is used to model the mapping between the quality feature vector and the video-level VMAF score. Because the model predicts the VMAF score of a distorted video without any original video information, it is of practical value. Experimental results show that the proposed model achieves high prediction accuracy.
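The second stage described in the abstract, turning per-frame VMAF predictions into a fixed-length quality feature vector, can be illustrated with a minimal sketch. The abstract does not name the three temporal pooling methods, so the arithmetic mean, the shifted harmonic mean, and the mean of the worst 20% of frames used here are assumptions chosen only for illustration. The function name `pool_frame_scores` and the parameter `low_fraction` are hypothetical as well.

```python
from statistics import fmean, harmonic_mean


def pool_frame_scores(frame_scores, low_fraction=0.2):
    """Aggregate per-frame VMAF predictions into a 3-element feature vector.

    The three pooling methods are illustrative assumptions; the abstract
    only states that three different temporal pooling methods are used.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")

    # Clamp predictions to the nominal VMAF range [0, 100].
    scores = [min(100.0, max(0.0, float(s))) for s in frame_scores]

    # Pooling 1 (assumed): arithmetic mean over all frames.
    mean_score = fmean(scores)

    # Pooling 2 (assumed): harmonic mean, which emphasises low-quality
    # frames. Scores are shifted by 1 so a zero score does not force
    # the result to zero, then the shift is undone.
    harmonic_score = harmonic_mean([s + 1.0 for s in scores]) - 1.0

    # Pooling 3 (assumed): mean of the worst `low_fraction` of frames.
    k = max(1, round(len(scores) * low_fraction))
    worst_score = fmean(sorted(scores)[:k])

    return [mean_score, harmonic_score, worst_score]


# Hypothetical frame-level predictions for one distorted video.
features = pool_frame_scores([90.0, 60.0, 30.0, 0.0, 70.0])
print(features)
```

In the pipeline the abstract describes, one such vector per video would then be used as the input to the NuSVR regressor (for example, scikit-learn's `sklearn.svm.NuSVR`), with the video-level VMAF score as the regression target. That final step is omitted here to keep the sketch dependency-free.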

       
