
Exponential Histogram of Cluster Feature for XML Stream

GAO Ming-xia, YAO Wen-ji, MAO Guo-jun

Citation: GAO Ming-xia, YAO Wen-ji, MAO Guo-jun. Exponential Histogram of Cluster Feature for XML Stream[J]. Journal of Beijing University of Technology, 2011, 37(8): 1242-1248.

Funding:

National Natural Science Foundation of China (60496322)

Doctoral Start-up Fund of Beijing University of Technology (X0007011200901).

Details
    About the author:

    GAO Ming-xia (b. 1973), female, from Zhangbei, Hebei; lecturer.

  • CLC number: TP311

  • Abstract: To support online dynamic clustering of XML (extensible markup language) data streams, an exponential histogram of XML cluster features is proposed. The structure is built on XML temporal cluster features and is maintained according to the rules of the exponential histogram. Experiments with a clustering algorithm that uses this structure, on real and synthetic data sets, show that its clustering quality can match or even exceed that of static clustering methods, and that memory overhead remains largely stable when the number of histograms is fixed.
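
The abstract names the maintenance rule but does not spell it out, so the following is only a minimal sketch of the general exponential-histogram idea, not the authors' structure. It assumes a hypothetical cluster feature made of a document count, an additive counter of structural paths, and the newest timestamp. The names `Bucket`, `ExpHistogram`, `window`, and `max_per_size` are illustrative assumptions, as is the choice of path counts as the summary.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical temporal cluster feature (an assumption, not the paper's definition)."""
    count: int       # number of documents summarized by this bucket
    paths: Counter   # additive summary: occurrences of each structural path
    newest: int      # timestamp of the most recent document in the bucket


class ExpHistogram:
    """Buckets hold 1, 2, 4, ... documents; at most max_per_size buckets share a size."""

    def __init__(self, window, max_per_size=2):
        self.window = window              # sliding-window length in time units
        self.max_per_size = max_per_size  # assumed cap on buckets of equal size
        self.buckets = []                 # ordered oldest first

    def insert(self, doc_paths, t):
        # Drop buckets whose newest document has left the window.
        self.buckets = [b for b in self.buckets if b.newest > t - self.window]
        # Every arriving document starts as its own bucket of size 1.
        self.buckets.append(Bucket(1, Counter(doc_paths), t))
        size = 1
        while True:
            same = [i for i, b in enumerate(self.buckets) if b.count == size]
            if len(same) <= self.max_per_size:
                break
            # Too many buckets of this size: merge the two oldest ones.
            i, j = same[0], same[1]
            a, b = self.buckets[i], self.buckets[j]
            self.buckets[j] = Bucket(a.count + b.count,
                                     a.paths + b.paths,
                                     max(a.newest, b.newest))
            del self.buckets[i]
            size *= 2  # the merge may overflow the next size, so keep checking

    def sizes(self):
        return [b.count for b in self.buckets]


hist = ExpHistogram(window=100)
for t in range(1, 8):
    hist.insert(["a/b"], t)
print(hist.sizes())
```

Because merges only combine additive summaries, the number of buckets grows logarithmically with the number of documents in the window, which is consistent with the abstract's remark that memory overhead stays largely stable.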
Publication history
  • Received: 2009-09-07
  • Published online: 2022-11-18