JIANG Feng, ZHANG Youqiang, DU Junwei, LIU Guozhu, SUI Yuefei. Approximate Reducts-based Ensemble Learning Algorithm and Its Application in Intrusion Detection[J]. Journal of Beijing University of Technology, 2016, 42(6): 877-885. DOI: 10.11936/bjutxb2015100008

    Approximate Reducts-based Ensemble Learning Algorithm and Its Application in Intrusion Detection

    Abstract: To obtain base learners with high diversity for constructing an ensemble learner, ensemble learning is considered from the perspective of partitioning the attribute space. The concept of an approximate reduct is defined using rough set theory, and an approximate reducts-based ensemble learning algorithm is then proposed. The method partitions the attribute space of a data set into multiple subspaces. Base learners trained on the data sets corresponding to different subspaces differ substantially from one another, which ensures that the ensemble learner has strong generalization performance. To verify its effectiveness, the algorithm is applied to network intrusion detection. Experimental results on the KDD CUP 99 data set show that, compared with traditional ensemble learning algorithms, the proposed algorithm achieves a higher detection rate and lower computational cost, making it better suited to detecting intrusions in massive, high-dimensional network data.
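
    The idea of training one base learner per attribute subspace and combining their outputs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the round-robin split in `partition_attributes` is only a placeholder for the approximate reducts defined in the paper (no rough-set computation is performed), the nearest-centroid `CentroidLearner` is a stand-in base learner, and majority voting is one common way to combine base learners rather than the paper's confirmed combination rule.

```python
from collections import Counter, defaultdict


def partition_attributes(n_attrs, k):
    """Split attribute indices 0..n_attrs-1 into k disjoint subsets (round-robin).

    Placeholder only: the paper derives subspaces from approximate reducts,
    which this sketch does not compute.
    """
    return [list(range(i, n_attrs, k)) for i in range(k)]


class CentroidLearner:
    """Nearest-centroid classifier that sees only a subset of the attributes."""

    def __init__(self, attrs):
        self.attrs = attrs
        self.centroids = {}

    def fit(self, X, y):
        # Group the projected rows by class label, then average each column.
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append([row[a] for a in self.attrs])
        for label, rows in groups.items():
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, row):
        proj = [row[a] for a in self.attrs]

        def sq_dist(centroid):
            return sum((p - q) ** 2 for p, q in zip(proj, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def train_ensemble(X, y, k):
    """Train one base learner per attribute subspace."""
    subsets = partition_attributes(len(X[0]), k)
    return [CentroidLearner(s).fit(X, y) for s in subsets]


def ensemble_predict(learners, row):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(learner.predict_one(row) for learner in learners)
    return votes.most_common(1)[0][0]
```

    Because each base learner works on a disjoint slice of the attributes, each one handles fewer dimensions than a learner trained on the full attribute set, which is the intuition behind the lower computational cost reported in the abstract.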

       
