Journal of Northeastern University (Natural Science) ›› 2010, Vol. 31 ›› Issue (2): 177-180.

• Articles •

Distributed Top-k query algorithm based on uncertain data

Wang, Shuang (1); Wang, Guo-Ren (2)

  1. (1) School of Software, Northeastern University, Shenyang 110004, China; (2) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received:2013-06-20 Revised:2013-06-20 Online:2010-02-15 Published:2013-06-20
  • Contact: Wang, S.
  • Supported by:
    National Natural Science Foundation of China (60873011)

Abstract: Existing Top-k query algorithms for uncertain data consider only centralized environments; none addresses a distributed setting. To save bandwidth in distributed systems, a distributed Top-k query algorithm for uncertain data, UDTopk, is proposed. The algorithm defines a candidate set that holds only the data needed to answer the query: tuples pruned from the set cannot affect the answer, so the correct Top-k result is obtained from the candidate set alone, without accessing every tuple in the data set. The candidate set is maintained dynamically as new tuples arrive, and only a small amount of data is transmitted, which reduces data transmission in the network. Experimental results show that UDTopk effectively reduces communication cost and saves network bandwidth.

Keywords: Top-k query, uncertain data, distributed processing, communication cost, query processing
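
The candidate-set idea described in the abstract can be illustrated with a minimal Python sketch. This is not the UDTopk algorithm itself, whose details are not given on this page. It assumes a simplified ranking semantics (expected score, i.e. score multiplied by existence probability), and the tuple layout and function names are hypothetical. Under that assumption, each site only needs to send its k best tuples to a coordinator, because no other local tuple can appear in the global Top-k.

```python
import heapq
from typing import List, Tuple

# Hypothetical tuple layout: (tuple_id, score, existence_probability).
UTuple = Tuple[str, float, float]


def expected_score(t: UTuple) -> float:
    """Assumed ranking function: score weighted by existence probability."""
    return t[1] * t[2]


def local_candidates(site_data: List[UTuple], k: int) -> List[UTuple]:
    """Keep only a site's k best tuples; the rest cannot reach the global Top-k."""
    return heapq.nlargest(k, site_data, key=expected_score)


def distributed_topk(sites: List[List[UTuple]], k: int) -> Tuple[List[UTuple], int]:
    """Merge per-site candidates at a coordinator.

    Returns the global Top-k and the number of tuples that had to be transmitted.
    """
    candidates = [t for site in sites for t in local_candidates(site, k)]
    answer = heapq.nlargest(k, candidates, key=expected_score)
    return answer, len(candidates)


if __name__ == "__main__":
    sites = [
        [("a", 10.0, 0.9), ("b", 8.0, 0.5), ("c", 3.0, 1.0)],
        [("d", 9.0, 0.2), ("e", 7.0, 0.8), ("f", 6.0, 0.7)],
    ]
    answer, sent = distributed_topk(sites, k=2)
    print([t[0] for t in answer], "tuples transmitted:", sent)
```

In this toy setting the coordinator receives at most k tuples per site instead of every tuple, which is the kind of bandwidth saving the abstract refers to. Other uncertain Top-k semantics would need a different pruning rule.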
