Text Classification Method Based on Multi-layer Extreme Learning Machine

    • Abstract: When text data are high-dimensional, the single-hidden-layer regularized extreme learning machine (ELM) cannot represent text features well enough for classification. To address this problem, this paper proposes a text classification method based on the multi-layer extreme learning machine (ML-ELM). First, the compressed representation of the extreme learning machine auto-encoder (ELM-AE) is used to reduce the dimensionality of the high-dimensional text data. Then, the multi-hidden-layer structure of ML-ELM extracts high-level text features, and the text data are classified by the least-squares method. Experimental comparisons with several other algorithms show that the proposed method achieves good classification performance on three datasets: 20newsgroup, Reuters, and the Fudan University Chinese Corpus.
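The two steps named in the abstract (ELM-AE compression, then least-squares classification on the stacked features) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid activation, the regularization constant `C`, the QR-orthonormalized random weights, the layer sizes, and all function names (`elm_ae_layer`, `fit_ml_elm`, `predict_ml_elm`) are assumptions made for illustration, following the commonly described form of ML-ELM.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ridge_solve(H, T, C):
    # Regularized least squares: beta = (I/C + H^T H)^{-1} H^T T
    n = H.shape[1]
    return np.linalg.solve(np.eye(n) / C + H.T @ H, H.T @ T)

def elm_ae_layer(X, n_hidden, C, rng):
    # ELM auto-encoder with a compressed hidden layer (assumes n_hidden <= X.shape[1]).
    d = X.shape[1]
    # Random input weights with orthonormal columns (assumed initialization).
    A, _ = np.linalg.qr(rng.standard_normal((d, n_hidden)))
    b = rng.standard_normal(n_hidden)
    b /= np.linalg.norm(b)
    H = sigmoid(X @ A + b)
    # Output weights that reconstruct X from H; shape (n_hidden, d).
    return ridge_solve(H, X, C)

def fit_ml_elm(X, y, hidden_sizes, n_classes, C=1e3, seed=0):
    rng = np.random.default_rng(seed)
    betas, F = [], X
    for n_hidden in hidden_sizes:
        beta = elm_ae_layer(F, n_hidden, C, rng)
        # The transposed auto-encoder weights map the data to the compressed features.
        F = sigmoid(F @ beta.T)
        betas.append(beta)
    T = np.eye(n_classes)[y]          # one-hot targets
    W_out = ridge_solve(F, T, C)      # least-squares classifier on top features
    return betas, W_out

def predict_ml_elm(X, betas, W_out):
    F = X
    for beta in betas:
        F = sigmoid(F @ beta.T)
    return np.argmax(F @ W_out, axis=1)
```

In this sketch `hidden_sizes` must be non-increasing relative to the input dimension (for example `[20, 10]` for 50-dimensional inputs), so that each ELM-AE layer produces a compressed representation. For real text data, `X` would be a document-term matrix such as TF-IDF features.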

       
