Exponential Histogram of Cluster Feature for XML Stream
Abstract: To cluster XML (extensible markup language) data streams online and dynamically, a data structure named the exponential histogram of cluster feature for XML is proposed. The structure is based on the XML temporal cluster feature and is maintained according to the exponential histogram rule. Experiments with a clustering algorithm that uses this structure, run on a real data set and a synthetic data set, show that its clustering quality can match or even exceed that of static (offline) clustering methods, and that memory overhead remains essentially stable when the number of histograms is fixed.
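The abstract does not define the structure in detail, so the following is only a minimal sketch of how an exponential histogram whose buckets hold temporal cluster features might be maintained. It assumes that each XML document has already been mapped to a numeric feature vector, that a temporal cluster feature stores a count, a linear sum, a sum of squares, and the most recent timestamp, and that the classic exponential histogram merge rule applies (when more than a fixed number of buckets share a size, the two oldest are merged). The names `TemporalCF`, `EHCF`, `max_per_size`, and `window` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TemporalCF:
    """Assumed form of a temporal cluster feature (not taken from the paper)."""
    n: int                  # number of documents summarized
    ls: Tuple[float, ...]   # per-dimension linear sum of feature vectors
    ss: float               # sum of squared vector norms
    t: int                  # timestamp of the most recent document

    def merge(self, newer: "TemporalCF") -> "TemporalCF":
        # Cluster features are additive, so merging is a component-wise sum.
        return TemporalCF(
            self.n + newer.n,
            tuple(a + b for a, b in zip(self.ls, newer.ls)),
            self.ss + newer.ss,
            max(self.t, newer.t),
        )


class EHCF:
    """Sketch of an exponential histogram of cluster features."""

    def __init__(self, max_per_size: int, window: int):
        self.max_per_size = max_per_size     # allowed buckets per size
        self.window = window                 # sliding-window length
        self.buckets: List[TemporalCF] = []  # ordered oldest to newest

    def insert(self, vec: Sequence[float], t: int) -> None:
        # Each arriving document starts as its own bucket of size 1.
        ss = sum(x * x for x in vec)
        self.buckets.append(TemporalCF(1, tuple(vec), ss, t))

        # Drop buckets whose newest document has left the window.
        while self.buckets and self.buckets[0].t <= t - self.window:
            self.buckets.pop(0)

        # Cascade merges: sizes 1, 2, 4, ... from newest to oldest.
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.buckets) if b.n == size]
            if len(idx) <= self.max_per_size:
                break
            i = idx[0]  # same-size buckets are contiguous, so i + 1 is next
            merged = self.buckets[i].merge(self.buckets[i + 1])
            self.buckets[i:i + 2] = [merged]
            size *= 2

    def summary(self) -> TemporalCF:
        # Combine all live buckets into one cluster feature.
        total = self.buckets[0]
        for b in self.buckets[1:]:
            total = total.merge(b)
        return total
```

Under these assumptions the number of buckets grows only logarithmically with the window length, which is consistent with the abstract's observation that memory overhead stays roughly stable when the number of histograms is fixed.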