Journal of Northeastern University (Natural Science) ›› 2005, Vol. 26 ›› Issue (11): 31-34.

• Articles •

Sharing strategy supporting multi-aggregate queries in sliding window over data streams

Yu, Ya-Xin (1); Zhu, Xin-Hua (2); Yu, Ge (1)

  1. (1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China; (2) Neusoft Group Ltd. Co., Shenyang 110179, China
  • Received:2013-06-24 Revised:2013-06-24 Online:2005-11-15 Published:2013-06-24
  • Contact: Yu, Y.-X.
  • Supported by:
    National Natural Science Foundation of China (60473073)

Keywords: data stream, sliding window, aggregate query, sharing, linked-tree, hop count

Abstract: Improving the efficiency of multiple aggregate queries is a key problem in data stream processing. To address it, a novel linked-tree sharing strategy is proposed to support multi-aggregate queries over sliding windows on data streams. Each sliding window logically divides the linked-tree into several sub-trees, and the aggregate value stored in the root node of each sub-tree is exactly the answer to the corresponding query. This logical division lets multiple queries run concurrently on a single tree, which avoids building a separate linked-tree for each query. Because the linked-tree itself reduces the number of useless repeated comparisons, the new aggregate value of every query can be obtained with only a few update comparisons when many queries share the same linked-tree. Experiments show that sharing one linked-tree among multiple queries maximizes resource sharing and reduces memory usage. As a result, query processing efficiency, task throughput, and overall system performance over data streams are all greatly improved.
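The sharing idea in the abstract can be illustrated with a small Python sketch. It is not the paper's linked-tree, whose node layout is not given on this page. As a stand-in it assumes a MAX aggregate and uses a well-known monotonic deque: one shared candidate list serves every registered window size, and each query reads its answer from the first candidate that falls inside its own window. The class name `SharedWindowMax` and its methods are hypothetical.

```python
from bisect import bisect_left
from collections import deque


class SharedWindowMax:
    """One shared structure answering MAX for several sliding-window sizes.

    Assumption: a monotonic deque of (position, value) pairs stands in for
    the paper's linked-tree; it is an illustration, not the authors' design.
    """

    def __init__(self, window_sizes):
        sizes = sorted(set(window_sizes))
        if not sizes or sizes[0] < 1:
            raise ValueError("window sizes must be positive integers")
        self.window_sizes = sizes
        self.largest = sizes[-1]
        self.candidates = deque()  # positions increase, values strictly decrease
        self.count = 0             # number of stream items seen so far

    def insert(self, value):
        # Candidates dominated by the new item can never again be the maximum
        # of any window, because every such window also contains the new item.
        while self.candidates and self.candidates[-1][1] <= value:
            self.candidates.pop()
        self.candidates.append((self.count, value))
        self.count += 1
        # Expire candidates that have left even the largest window.
        oldest = self.count - self.largest
        while self.candidates[0][0] < oldest:
            self.candidates.popleft()

    def answers(self):
        """Return {window_size: current maximum} for every registered query."""
        if self.count == 0:
            return {}
        positions = [pos for pos, _ in self.candidates]
        result = {}
        for w in self.window_sizes:
            start = max(0, self.count - w)
            # The first surviving candidate inside the window is its maximum.
            idx = bisect_left(positions, start)
            result[w] = self.candidates[idx][1]
        return result
```

For example, after inserting 5, 3, 4, 1, 2 into `SharedWindowMax([2, 4])`, `answers()` returns 2 for the window of size 2 and 4 for the window of size 4. Each arrival updates the shared list once, however many queries are registered, which mirrors the "build once, answer many" goal described above.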
