Journal of Northeastern University (Natural Science) ›› 2013, Vol. 34 ›› Issue (3): 348-350.

• Information and Control •

A Topic Model for Selecting Query Expansion Terms

ZHANG Bo, ZHANG Bin, GAO Kening

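  1. (School of Information Science & Engineering, Northeastern University, Shenyang 110819, Liaoning, China)
  • Received: 2012-09-03 Revised: 2012-09-03 Online: 2013-03-15 Published: 2013-01-26
  • Corresponding author: ZHANG Bo
  • About the authors: ZHANG Bo (1981-), male, from Shenyang, Liaoning, doctoral candidate at Northeastern University; ZHANG Bin (1964-), male, from Benxi, Liaoning, professor and doctoral supervisor at Northeastern University; GAO Kening (1963-), female, from Shenyang, Liaoning, professor at Northeastern University.
  • Supported by:
    Natural Science Foundation of Liaoning Province (20102060).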

A Topic Model for Extracting Expansion Items

ZHANG Bo, ZHANG Bin, GAO Kening   

  1. School of Information Science & Engineering, Northeastern University, Shenyang 110819, China.
  • Received: 2012-09-03 Revised: 2012-09-03 Online: 2013-03-15 Published: 2013-01-26
  • Contact: ZHANG Bin

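Abstract: To build a topic layer that stays close to user intent on the result set returned by a search engine, and to establish a mapping between document words and topics, social annotation is introduced into the classic LDA model. The result is a three-layer topic model based on the relationships among topics, tags, and document words, which is then used to select query expansion terms for pseudo-relevance feedback. Experimental results show that the expansion terms extracted by the model describe the semantics of the tags. When the model is used for pseudo-relevance feedback, the extracted expansion terms cover the query, and in most cases the NDCG of the result list is higher than that of basic pseudo-relevance feedback and result-set clustering.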

Keywords: topic model, pseudo-relevance feedback, query expansion, expansion term selection, social annotation

Abstract: Topic models can help pseudo-relevance feedback in query expansion. The main shortcoming of the classic topic model is that a topic level needs to be assumed. To construct a topic level close to user intent on the corpus and to create a mapping between topics and words, social annotation was introduced into the classic topic model LDA (latent Dirichlet allocation), and a three-level topic model of topic, label, and word was constructed, which was applied to select query expansion terms for pseudo-relevance feedback. The results showed that the model can describe the semantics of the labels and extract expansion terms that cover the query. In most cases, the model's NDCG values are higher than those of classic pseudo-relevance feedback and result-set clustering.

Key words: topic model, pseudo feedback, query expansion, word extraction, social annotation
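
The abstract compares methods by the NDCG of the result list but does not say which NDCG variant was used. As a rough illustration only, the sketch below assumes the common formulation with a log2(rank + 1) discount and an ideal ordering obtained by sorting the same graded relevance labels. The function names `dcg` and `ndcg` and the optional cutoff `k` are hypothetical and are not taken from the paper.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of graded relevance labels.

    Assumption: gain is the raw label and the discount is log2(rank + 1),
    with ranks starting at 1. The paper may use a different variant.
    """
    if k is not None:
        relevances = relevances[:k]
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = sorted(relevances, reverse=True)
    ideal_dcg = dcg(ideal, k)
    if ideal_dcg == 0:
        # No relevant documents at all: define NDCG as 0 to avoid dividing by zero.
        return 0.0
    return dcg(relevances, k) / ideal_dcg
```

Under these assumptions, a result list that is already sorted by relevance scores 1.0, and moving relevant documents lower in the list reduces the score, which is what makes the measure suitable for comparing result lists produced by different query expansion methods.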
