Journal of Northeastern University (Natural Science) ›› 2010, Vol. 31 ›› Issue (2): 172-176.

• Articles •

Automatic categorization method for Web database query results

Meng, Xiang-Fu; Ma, Zong-Min; Yan, Li; Zhang, Fu

  1. School of Information Science and Engineering, Northeastern University
  • Received: 2013-06-20 Revised: 2013-06-20 Online: 2010-02-15 Published: 2013-06-20
  • Supported by:
    Program for New Century Excellent Talents in University, Ministry of Education of China (NCET-05-0288)

Automated categorization of web database query results

Meng, Xiang-Fu (1); Ma, Zong-Min (1); Yan, Li (1); Zhang, Fu (1)   

  1. School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received: 2013-06-20 Revised: 2013-06-20 Online: 2010-02-15 Published: 2013-06-20
  • Contact: Meng, X.-F.

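Abstract: To address the problem of too many results being returned by Web database queries, an automatic categorization method for Web database query results is proposed. The method dynamically generates a labeled, hierarchical category tree over the query results. The tree is built in two processing phases. In the offline phase, the query histories of all users in the system are analyzed and semantically similar queries are aggregated; the original data are then partitioned into multiple tuple clusters according to the aggregated queries, with each tuple cluster corresponding to one type of user preference. When a user query arrives, the online query processing phase uses the tuple clusters generated in the first phase to build a category tree over the query result set, so that the user can conveniently select and locate the needed information. Experiments and analysis show that the proposed categorization method meets users' personalized query needs well.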

Keywords: Web database, user preference, tuple clustering, C4.5 algorithm, query result categorization

Abstract: To address the problem of a Web database returning too many results for a user query, an approach is proposed that categorizes the query results automatically by dynamically generating a labeled, hierarchical navigational tree over them. The tree is constructed in two processing steps. In the offline step, the query histories of all users in the system are analyzed and semantically similar queries are clustered; the original data are then divided into multiple tuple clusters according to the clustered queries, with each tuple cluster corresponding to one type of user preference. In the online step, when a user query arrives, the tuple clusters generated in the first step are used to build a navigational tree over the query result set, so that the user can select and locate the needed information more easily. Experimental results and analysis indicate that the proposed categorization approach meets users' personalized query requirements effectively.
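
The two-phase pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Jaccard similarity over query conditions, the greedy clustering rule, the 0.5 threshold, the toy `homes` data, and all function names are assumptions made for illustration, and the gain-ratio split is a simplified, categorical-only stand-in for the C4.5 algorithm named in the keywords.

```python
import math
from collections import Counter, defaultdict

def jaccard(a, b):
    """Similarity of two queries, each a frozenset of (attribute, value) conditions."""
    return len(a & b) / len(a | b) if a | b else 1.0

def cluster_queries(history, threshold=0.5):
    """Offline, part 1: greedy clustering. A query joins the first cluster whose
    seed (first) query is similar enough; otherwise it starts a new cluster."""
    clusters = []
    for q in history:
        for c in clusters:
            if jaccard(q, c[0]) >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def matches(row, query):
    return all(row.get(attr) == val for attr, val in query)

def label_tuples(rows, clusters):
    """Offline, part 2: tag each tuple with the index of the query cluster that
    retrieves it most often (None if no past query retrieves it)."""
    labels = []
    for row in rows:
        hits = [sum(matches(row, q) for q in c) for c in clusters]
        labels.append(max(range(len(clusters)), key=hits.__getitem__) if any(hits) else None)
    return labels

def entropy(values):
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(rows, labels, attr):
    """Information gain of splitting on attr, normalized by the split entropy."""
    groups = defaultdict(list)
    for row, lab in zip(rows, labels):
        groups[row[attr]].append(lab)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    return (entropy(labels) - remainder) / split_info if split_info > 0 else 0.0

def build_tree(rows, labels, attrs):
    """Online: recursively split the result set on the attribute with the best
    gain ratio; child nodes are labeled 'attribute=value'."""
    if len(set(labels)) <= 1 or not attrs:
        return {"rows": rows}
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return {"rows": rows}
    parts = defaultdict(lambda: ([], []))
    for row, lab in zip(rows, labels):
        parts[row[best]][0].append(row)
        parts[row[best]][1].append(lab)
    rest = [a for a in attrs if a != best]
    return {"split_on": best,
            "children": {f"{best}={v}": build_tree(r, l, rest)
                         for v, (r, l) in parts.items()}}

# Hypothetical toy database and query history (not from the paper).
homes = [
    {"city": "Seattle", "type": "condo", "beds": 1},
    {"city": "Seattle", "type": "condo", "beds": 2},
    {"city": "Seattle", "type": "house", "beds": 3},
    {"city": "Seattle", "type": "house", "beds": 4},
    {"city": "Bellevue", "type": "house", "beds": 4},
]
history = [
    frozenset({("city", "Seattle"), ("type", "condo")}),
    frozenset({("type", "condo")}),
    frozenset({("city", "Seattle"), ("type", "house")}),
    frozenset({("type", "house")}),
]

# Offline phase: cluster past queries, then derive tuple clusters from them.
clusters = cluster_queries(history)
labels = label_tuples(homes, clusters)

# Online phase: a new query (city = Seattle) arrives; categorize its results.
result = [(h, lab) for h, lab in zip(homes, labels) if h["city"] == "Seattle"]
rows, labs = [h for h, _ in result], [lab for _, lab in result]
tree = build_tree(rows, labs, ["city", "type", "beds"])
```

On this toy data both `type` and `beds` separate the two tuple clusters perfectly, but the gain ratio penalizes the four-way split on `beds`, so the tree splits on `type` and presents the user with two labeled categories instead of four.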
