Journal of Northeastern University (Natural Science) ›› 2016, Vol. 37 ›› Issue (1): 24-28. DOI: 10.12068/j.issn.1005-3026.2016.01.006

• Information and Control •

A Top-k High Utility Itemset Mining Method Based on the Index Utility

LIN Shu-kuan, WANG Xiao-cong, QIAO Jian-zhong, WANG Rui

  1. School of Information Science & Engineering, Northeastern University, Shenyang 110819, China.
  • Received: 2014-10-31 Revised: 2014-10-31 Published: 2016-01-15 Online: 2016-01-08
  • Contact: LIN Shu-kuan
  • About authors: LIN Shu-kuan (born 1966), female, from Changchun, Jilin, professor at Northeastern University; QIAO Jian-zhong (born 1964), male, from Xingcheng, Liaoning, professor and doctoral supervisor at Northeastern University.
  • Supported by:
    National Natural Science Foundation of China (61272177).

Abstract: Existing Top-k high utility itemset mining methods preserve the downward closure property by using the transaction utility of an itemset in place of its real utility. This overestimates itemset utilities, which weakens pruning and lowers mining efficiency. To address this problem, the concept of index utility is proposed. On this basis, a two-level index is built and index pruning is performed, which strengthens pruning during mining and improves the efficiency of Top-k high utility itemset mining. In addition, a utility matrix is built to support fast calculation of itemset utilities, further improving mining efficiency. Experiments on different types of datasets validate the effectiveness and efficiency of the proposed method.

Key words: itemset utility, index utility, Top-k high utility itemset, ending super itemset, utility matrix
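
The overestimation described in the abstract can be illustrated with a small sketch. It assumes that "transaction utility" refers to the transaction-weighted utility commonly used in the high utility itemset mining literature: the sum of the total utilities of all transactions that contain the itemset. The toy transactions, unit profits, and function names below are hypothetical and are not taken from the paper. The index utility, two-level index, and utility matrix are not shown, because this page does not define them.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility_in_transaction(itemset, transaction):
    """Real utility of `itemset` in one transaction (0 if not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * unit_profit[item] for item in itemset)


def real_utility(itemset):
    """Real utility of `itemset`: summed over all transactions containing it."""
    return sum(utility_in_transaction(itemset, t) for t in transactions)


def transaction_utility(transaction):
    """Total utility of every item in a single transaction."""
    return sum(qty * unit_profit[item] for item, qty in transaction.items())


def transaction_weighted_utility(itemset):
    """Upper bound (assumed interpretation): total utility of all
    transactions that contain `itemset`."""
    return sum(
        transaction_utility(t)
        for t in transactions
        if all(item in t for item in itemset)
    )


def top_k_brute_force(k):
    """Naive baseline: score every itemset by real utility and keep the best k.
    Exponential in the number of items; shown only to define the target output."""
    items = sorted(unit_profit)
    scored = [
        (real_utility(candidate), candidate)
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
    ]
    scored = [entry for entry in scored if entry[0] > 0]
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return scored[:k]


if __name__ == "__main__":
    for itemset in [("a",), ("a", "b"), ("a", "c")]:
        print(itemset, real_utility(itemset), transaction_weighted_utility(itemset))
    print(top_k_brute_force(3))
```

In this toy data the real utility is not downward closed: ("a", "b") scores lower than ("a",), while ("a", "c") scores higher. The transaction-weighted bound never increases when an itemset grows, which is what makes it usable for pruning, but it can sit well above the real utility, which is the looseness the abstract refers to.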

CLC Number: