WEN Han, XIAO Nan-feng. Semi-supervised Classification Using Feature Distribution[J]. Journal of Beijing University of Technology, 2012, 38(1): 75-80.

    Semi-supervised Classification Using Feature Distribution

    It is crucial for semi-supervised learning (SSL) to reduce the dimensionality of the feature space through feature selection. The popular information gain (IG) selection method, which favors high-frequency words, ignores the similarity between classes, so the classification performance obtained with IG-selected features is unstable. This paper proposes a feature distribution selection method that helps IG retain features carrying highly class-discriminative information. To address the inherent efficiency problem of the expectation maximization (EM) algorithm, unlabeled documents with the maximum posterior category probability are transferred from the unlabeled collection to the labeled collection, which markedly reduces the number of iterations of the improved EM. Finally, experimental evaluation on two data sets, Reuters-21578 and Epinion.com, shows that the semi-supervised learning method using feature distribution performs effectively under the micro-averaged F1 criterion.
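
    For readers unfamiliar with the IG baseline mentioned in the abstract, the following is a minimal sketch of the standard textbook information gain score for a single term, computed from term presence or absence in each document. It is not the paper's feature distribution criterion, which the abstract does not specify. The function names (`entropy`, `information_gain`) and the toy documents and labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Textbook IG of a term: H(C) minus the entropy of C after splitting
    the documents on presence/absence of the term.

    docs   -- list of sets of words (hypothetical bag-of-words documents)
    labels -- list of class labels, aligned with docs
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Toy data (invented for illustration, not taken from Reuters-21578).
docs = [
    {"oil", "price"},
    {"oil", "export"},
    {"wheat", "price"},
    {"wheat", "crop"},
]
labels = ["crude", "crude", "grain", "grain"]

ig_oil = information_gain(docs, labels, "oil")      # term splits the classes perfectly
ig_price = information_gain(docs, labels, "price")  # term occurs equally in both classes
```

    In this toy example, "oil" separates the two classes completely while "price" appears equally often in both, so the score ranks "oil" above "price". A ranking of this kind depends only on term presence counts, which is the setting in which the abstract says IG leans toward high-frequency words.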