Hierarchy Construction and Node Classification of Heterogeneous Information Networks Based on Stacked Denoising Autoencoders

Abstract: Heterogeneous information networks carry rich semantic information and have a complex structure, and traditional feature extraction methods do not handle the resulting data sparsity and noise well. To address this, a stacked denoising autoencoder is used to extract features. The learned features support a relaxation strategy for building a high-quality class hierarchy, on which the nodes of the network are then classified and ranked. Experimental results on the DBLP (digital bibliography & library project) computer science bibliography dataset show that the method outperforms other classification algorithms, with a classification precision of up to 86.3%.
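
The abstract does not give the autoencoder architecture, the noise model, or the training procedure. As an illustration only, the feature extraction step might be sketched as below, assuming masking noise, sigmoid units, tied encoder/decoder weights, a cross-entropy reconstruction loss, inputs scaled to [0, 1], and greedy layer-wise training. The function names `train_dae` and `stacked_dae_features` and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_dae(X, n_hidden, corruption=0.3, lr=0.5, epochs=200, seed=0):
    """Train one denoising autoencoder layer (tied weights, assumed setup).

    X is an (n_samples, n_features) array with values in [0, 1].
    Returns the encoder weights W and hidden bias b_h.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(d)
    for _ in range(epochs):
        # Masking noise: zero out a random fraction of the input entries.
        mask = rng.random(X.shape) >= corruption
        X_tilde = X * mask
        # Encode the corrupted input, then decode with the transposed weights.
        H = sigmoid(X_tilde @ W + b_h)
        X_hat = sigmoid(H @ W.T + b_v)
        # Cross-entropy against the CLEAN input; with a sigmoid output the
        # gradient at the output pre-activation is (X_hat - X).
        d_out = (X_hat - X) / n
        d_hid = (d_out @ W) * H * (1.0 - H)
        # Tied weights: W receives gradient from both encoder and decoder.
        grad_W = X_tilde.T @ d_hid + d_out.T @ H
        W -= lr * grad_W
        b_h -= lr * d_hid.sum(axis=0)
        b_v -= lr * d_out.sum(axis=0)
    return W, b_h


def stacked_dae_features(X, layer_sizes, **train_kwargs):
    """Greedy layer-wise stacking: each layer is trained on the clean
    encoding produced by the previous layer."""
    H = X
    for size in layer_sizes:
        W, b_h = train_dae(H, size, **train_kwargs)
        H = sigmoid(H @ W + b_h)
    return H


if __name__ == "__main__":
    # Toy sparse binary matrix standing in for node features (synthetic data).
    rng = np.random.default_rng(42)
    X = (rng.random((30, 50)) < 0.1).astype(float)
    features = stacked_dae_features(X, [16, 8], epochs=50)
    print(features.shape)
```

The rows of the returned matrix are dense, low-dimensional node representations. In a pipeline like the one the abstract outlines, such features would be passed on to the hierarchy construction and classification stages, which are not sketched here.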