Journal of Northeastern University, 2010, Vol. 31, Issue (6): 782-785.

• Original Paper

Public blog clustering algorithm based on revision by comments

Guo, Peng-Wei (1); Gao, Ke-Ning (1); Zhang, Bin (1)   

  (1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received:2013-06-20 Revised:2013-06-20 Online:2010-06-15 Published:2013-06-20
  • Contact: Zhang, B.

Abstract: Public blog clustering is an effective way to process blog information. This paper proposes a public blog clustering algorithm based on revision by comments. After analyzing the information hierarchy of public blogs, a public blog attribute model was developed from the general attributes of blog pages and used as the basis for clustering. After the initial clustering, the comments on the clustered blogs were used to revise the clustering result. The results were evaluated with entropy and purity, and two test schemes were designed that differ in how the comments are incorporated: in the first, the comments participate directly in the clustering process; in the second, the comments are used after clustering to revise the result. Test results showed that, in most cases, the second scheme was more effective than the first.
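
The abstract names entropy and purity as the evaluation measures but does not give their formulas. As an illustration only, the sketch below uses the commonly cited definitions: purity is the fraction of items that belong to the majority class of their cluster, and entropy is the size-weighted average of each cluster's class entropy (lower is better). The function names, the base-2 logarithm, and the sample labels are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def _group_by_cluster(cluster_ids, class_labels):
    """Collect the reference class labels of the items in each cluster."""
    clusters = defaultdict(list)
    for cid, label in zip(cluster_ids, class_labels):
        clusters[cid].append(label)
    return clusters


def purity(cluster_ids, class_labels):
    """Fraction of items that fall in the majority class of their cluster."""
    clusters = _group_by_cluster(cluster_ids, class_labels)
    total = len(class_labels)
    majority = sum(max(Counter(m).values()) for m in clusters.values())
    return majority / total


def entropy(cluster_ids, class_labels):
    """Size-weighted mean of per-cluster class entropy (base 2, an assumption)."""
    clusters = _group_by_cluster(cluster_ids, class_labels)
    total = len(class_labels)
    result = 0.0
    for members in clusters.values():
        size = len(members)
        h = -sum((c / size) * math.log2(c / size)
                 for c in Counter(members).values())
        result += (size / total) * h
    return result


if __name__ == "__main__":
    # Hypothetical data: cluster assignments and hand-assigned blog topics.
    assigned = [0, 0, 0, 1, 1, 1]
    topics = ["tech", "tech", "travel", "travel", "travel", "travel"]
    print("purity :", purity(assigned, topics))
    print("entropy:", entropy(assigned, topics))
```

Under these definitions, the two test schemes described in the abstract could be compared by computing both measures on the cluster assignments each scheme produces against the same reference labels.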
