User interests clustering based on PLSA

Journal of Northeastern University (Natural Science), 2008, Vol. 29, Issue 1: 53-56.

Chen, Dong-Ling (1); Wang, Da-Ling (1); Yu, Ge (1); Yu, Fang (1)

(1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China

Received: 2013-06-22 Revised: 2013-06-22 Online: 2008-01-15 Published: 2013-06-22
Contact: Chen, D.-L.
Supported by: National Natural Science Foundation of China (60573090; 60673139)

Abstract

Abstract: To mine users' latent interests accurately during personalized search and to cluster them accordingly, it is proposed to exploit the Zipf distribution of the latent semantic space in combination with PLSA (probabilistic latent semantic analysis) to capture the semantics of the full text. Specifically, the Zipf distribution principle is first used to find the latent semantic space of the documents; user interests are then clustered in this space, and a user profile is built in the form of a user interest hierarchy tree. Experiments show that the proposed clustering algorithm clearly outperforms clustering based on the conventional VSM (vector space model). In addition, personalized recommendation experiments on the well-known CTI data set indicate that user profiles built on the latent semantic space agree with users' real interests.

Keywords: user profile, PLSA, latent semantic space, Zipf distribution, user interest hierarchy tree
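The clustering step described in the abstract rests on PLSA, which models each document as a mixture of latent topics and is usually fitted with the EM algorithm. The following is a minimal sketch of that general technique, not the authors' implementation: the function names `plsa` and `cluster_labels`, the toy count matrix, and the choice of assigning each document to its most probable topic are all assumptions made for illustration. The Zipf-based selection of the latent semantic space and the user interest hierarchy tree are not reproduced here.

```python
import math
import random


def _normalize(row):
    """Scale non-negative weights to sum to 1 (uniform if all are zero)."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA with EM on a document-term count matrix (list of lists).

    Returns P(z|d) per document, P(w|z) per topic, and the log-likelihood
    of the data under the parameters held at the start of each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization, normalized into distributions.
    p_z_d = [_normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [_normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    history = []
    for _ in range(n_iter):
        acc_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        acc_w_z = [[0.0] * n_words for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                p_w_d = sum(joint)
                if p_w_d == 0.0:
                    continue  # degenerate case: no topic explains this pair
                log_lik += n_dw * math.log(p_w_d)
                for z in range(n_topics):
                    # Expected number of (d, w) occurrences owed to topic z.
                    expected = n_dw * joint[z] / p_w_d
                    acc_z_d[d][z] += expected
                    acc_w_z[z][w] += expected
        # M-step: renormalize the expected counts into distributions.
        p_z_d = [_normalize(row) for row in acc_z_d]
        p_w_z = [_normalize(row) for row in acc_w_z]
        history.append(log_lik)
    return p_z_d, p_w_z, history


def cluster_labels(p_z_d):
    """Assign each document to its most probable latent topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_d]


# Hypothetical toy data: rows are documents, columns are term counts.
toy_counts = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 1, 2],
    [0, 0, 0, 1, 3, 2],
]
doc_topics, topic_words, log_liks = plsa(toy_counts, n_topics=2, n_iter=30)
print(cluster_labels(doc_topics))
```

Each row of `doc_topics` is a low-dimensional description of a document in the latent space, in contrast to the term-level vectors of a VSM. Hard assignment by the largest P(z|d) is only one way to derive clusters from it; running a separate clustering algorithm over these rows is another.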

Chen, Dong-Ling; Wang, Da-Ling; Yu, Ge; Yu, Fang. User interests clustering based on PLSA[J]. Journal of Northeastern University (Natural Science), 2008, 29(1): 53-56.
