PDT-based document fragmentation of XML streaming data

Huo, Huan (1); Han, Dong-Hong (1); Hui, Xiao-Yun (1); Wang, Guo-Ren (1)   

  1. (1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received: 2013-06-22 Revised: 2013-06-22 Online: 2008-05-15 Published: 2013-06-22
  • Contact: Huo, H.
Abstract: Unlike the processing of XML data in conventional databases, the processing of XML stream data is constrained both by real-time requirements and by limited storage space. Building on the Hole-Filler model, a path frequency tree (PFT) is first defined from statistics on XML queries, and a sibling-based document fragmentation policy for XML stream data is proposed together with its algorithm. On this basis, a second fragmentation policy based on parent-child (membership) relationships is proposed, also with its algorithm. Both PFT-based algorithms effectively improve the utilization of XML fragments and strengthen their cohesion. Experimental results show that the PFT-based fragmentation algorithms perform well in terms of fragmentation time, query time, and space consumption.

Keywords: XML, data stream, path frequency tree, fragmentation, Hole-Filler model
