Fast Partition Algorithm in Depth Map Intra Coding Unit Based on Attention-Residual Bi-feature Stream Convolutional Neural Network
Abstract: To address the high complexity of depth map coding unit (CU) partitioning in three-dimensional high efficiency video coding (3D-HEVC), an algorithm based on convolutional neural networks (CNN) is proposed for fast depth map intra coding. First, an attention-residual bi-feature stream convolutional neural network (ARBS-CNN) model with three branches is proposed, in which two branches based on the residual module (RM) and the feature distillation (FD) module extract global image features, while the third branch, based on the dynamic module (DM) and the convolutional-convolutional block attention module (Conv-CBAM), extracts local image features. The extracted features are then integrated and output to obtain a prediction of the depth map CU partition structure. Finally, ARBS-CNN is embedded into the 3D-HEVC test platform, and the predicted results are used to accelerate depth map intra coding. Compared with the original algorithm, the proposed method reduces the coding time by 74.4% on average while leaving the rate-distortion performance almost unaffected. Experimental results show that the algorithm effectively reduces the coding complexity of 3D-HEVC while maintaining rate-distortion performance.
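To illustrate why a predicted partition structure saves encoding time, the following is a minimal sketch, not the authors' implementation. It assumes a 64x64 block split by a quadtree down to 8x8 CUs, and it replaces ARBS-CNN with a hypothetical stand-in (`stub_predictor`) that simply splits any block that is not flat. All function names here are illustrative assumptions. The point is only that following a predictor visits far fewer candidate CUs than an exhaustive search over every quadtree node.

```python
from typing import Callable, List

Block = List[List[int]]  # square block of depth samples


def sub_blocks(block: Block) -> List[Block]:
    """Split a square block into its four quadrants (one quadtree split)."""
    h = len(block) // 2
    return [
        [row[x:x + h] for row in block[y:y + h]]
        for y in (0, h)
        for x in (0, h)
    ]


def stub_predictor(block: Block) -> bool:
    """Hypothetical stand-in for the CNN: split only if the block is not flat."""
    samples = [v for row in block for v in row]
    return max(samples) != min(samples)


def partition(block: Block,
              predict_split: Callable[[Block], bool],
              min_size: int = 8) -> dict:
    """Build the CU quadtree from predictions instead of an exhaustive search."""
    size = len(block)
    if size > min_size and predict_split(block):
        children = [partition(b, predict_split, min_size)
                    for b in sub_blocks(block)]
        return {"size": size, "split": True, "children": children}
    return {"size": size, "split": False, "children": []}


def count_nodes(tree: dict) -> int:
    """Number of candidate CUs visited while building the tree."""
    return 1 + sum(count_nodes(c) for c in tree["children"])


def count_leaves(tree: dict) -> int:
    """Number of CUs in the final partition."""
    if not tree["split"]:
        return 1
    return sum(count_leaves(c) for c in tree["children"])


# An exhaustive search over depths 0..3 visits 1 + 4 + 16 + 64 candidate CUs.
EXHAUSTIVE_NODES = sum(4 ** d for d in range(4))

# Demo: a flat 64x64 depth block with a single differing sample in one corner.
demo = [[128] * 64 for _ in range(64)]
demo[0][0] = 0
demo_tree = partition(demo, stub_predictor)
print(count_nodes(demo_tree), "of", EXHAUSTIVE_NODES, "candidate CUs visited")
# prints "13 of 85 candidate CUs visited"
```

In a real encoder, each visited candidate CU would trigger a rate-distortion evaluation, so skipping nodes is where the time saving comes from. The reduction shown by this toy example is unrelated to the 74.4% figure reported above.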