Unsupervised Feature Selection Algorithm Based on Feature Clustering and Isometric Mapping

Abstract: To improve the accuracy and stability of feature selection in label-free scenarios, an unsupervised feature selection algorithm based on feature clustering and isometric mapping is proposed. Feature clustering first groups highly similar features into the same cluster. A new feature scoring function is then defined by combining isometric mapping with a sparse coefficient matrix. This function scores the features within each cluster, and the highest-scoring feature of each cluster is selected as its representative to form the feature subset. Experimental results on 14 widely used datasets show that the proposed algorithm selects features with strong classification ability and generalizes well.
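
The pipeline outlined in the abstract (cluster the features, score them with an isometric-mapping-based sparse regression, keep the top-scoring feature of each cluster) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the clustering method, or the exact scoring function, so every concrete choice below is an assumption made for illustration: absolute Pearson correlation with average-linkage clustering, a basic Isomap embedding, an L1-regularized regression solved by ISTA, and row norms of the coefficient matrix as scores. All function names and parameters are hypothetical, and the code assumes NumPy is available, no feature is constant, and the neighborhood graph is connected.

```python
import numpy as np

def cluster_features(X, n_clusters):
    """Average-linkage clustering of features on 1 - |Pearson correlation| (assumed measure)."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    clusters = [[j] for j in range(X.shape[1])]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters

def isomap_embedding(X, n_neighbors, n_components):
    """Basic Isomap: kNN graph, all-pairs shortest paths, then classical MDS."""
    n = X.shape[0]
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nn = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        graph[i, nn[i]] = d[i, nn[i]]
        graph[nn[i], i] = d[i, nn[i]]  # keep the graph symmetric
    for k in range(n):  # Floyd-Warshall, vectorized over all (i, j) pairs
        graph = np.minimum(graph, graph[:, [k]] + graph[[k], :])
    if not np.isfinite(graph).all():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")
    J = np.eye(n) - 1.0 / n  # centering matrix I - (1/n) 11^T
    B = -0.5 * J @ (graph ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def sparse_coefficients(X, Y, lam, n_iter=500):
    """ISTA for min_W 0.5 * ||XW - Y||_F^2 + lam * ||W||_1."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        G = W - step * X.T @ (X @ W - Y)
        W = np.sign(G) * np.maximum(np.abs(G) - step * lam, 0.0)  # soft threshold
    return W

def select_features(X, n_clusters, n_neighbors=8, n_components=2, lam=0.1):
    """Return one feature index per cluster: the feature with the largest score."""
    X = (X - X.mean(0)) / X.std(0)  # standardize; assumes no constant feature
    clusters = cluster_features(X, n_clusters)
    Y = isomap_embedding(X, n_neighbors, n_components)
    W = sparse_coefficients(X, Y, lam)
    scores = np.linalg.norm(W, axis=1)  # assumed score: row norm of W
    return sorted(max(c, key=lambda j: scores[j]) for c in clusters)
```

Calling `select_features(X, n_clusters=k)` on an n-by-d array returns k feature indices, one per cluster. Here the number of clusters fixes the size of the selected subset, which is a property of this sketch and not necessarily of the proposed algorithm.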