Journal of Northeastern University (Natural Science) ›› 2017, Vol. 38 ›› Issue (8): 1065-1069. DOI: 10.12068/j.issn.1005-3026.2017.08.001

• Information and Control •

A Fuzzy C-Means Text Clustering Method Based on the Black Hole Algorithm

LIU Yu-hui1, WANG Wei-chao1, MENG Lei2

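  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, Liaoning, China; 2. Neunn Technology Co., Ltd., Shenyang 110169, Liaoning, China
  • Received: 2016-03-14 Revised: 2016-03-14 Published: 2017-08-15 Released: 2017-08-12
  • Corresponding author: LIU Yu-hui
  • About the author: LIU Yu-hui (born 1966), male, from Yantai, Shandong; associate professor at Northeastern University.
  • Funding:
    National High Technology Research and Development Program of China (2015AA016005).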

Document Clustering of Fuzzy C-Means Based on Black Hole Algorithm

LIU Yu-hui1, WANG Wei-chao1, MENG Lei2   

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China; 2. Neunn Technology Co., Ltd., Shenyang 110169, China.
  • Received: 2016-03-14 Revised: 2016-03-14 Online: 2017-08-15 Published: 2017-08-12
  • Contact: LIU Yu-hui
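  • About author: LIU Yu-hui (born 1966), male, from Yantai, Shandong; associate professor at Northeastern University.
  • Supported by:
    National High Technology Research and Development Program of China (2015AA016005).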

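Abstract: When the FCM algorithm is applied to text clustering, the random selection of initial cluster centers and the tendency to become trapped in local optima lead to poor clustering results. To improve the clustering accuracy of FCM, a method is proposed that uses the black hole algorithm to find the optimal initial cluster centers for FCM. The black hole algorithm is a heuristic optimization method: throughout the search for FCM's initial cluster centers, the black hole is always maintained as the global optimal solution, and the optimal initial cluster centers are ultimately found. Experimental results show that the FCM text clustering method based on the black hole algorithm resolves FCM's sensitivity to initial centers and its tendency to fall into local optima, and that clustering accuracy improves markedly.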

Key words: fuzzy C-means, black hole algorithm, text clustering, parameter search, initial cluster centers

Abstract: When the fuzzy c-means (FCM) algorithm is applied to document clustering, results are poor because the initial cluster centers are chosen at random and the algorithm easily falls into a local optimum. To improve FCM's clustering accuracy, a method is proposed that uses the black hole algorithm (BHA), a heuristic optimization algorithm, to find the optimal initial cluster centers for FCM. While searching for FCM's initial cluster centers, the black hole is always kept as the global best solution, so that the best initial cluster centers can be found. Experimental results show that FCM document clustering based on the black hole algorithm overcomes FCM's sensitivity to initial centers and its tendency to fall into local optima, and that clustering accuracy is improved significantly.

Key words: fuzzy c-means, black hole algorithm, document clustering, parameter searching, initial clustering center

CLC number:
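
The abstract describes the approach only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes the commonly described form of the black hole algorithm: each "star" is a candidate set of cluster centers, fitness is the FCM objective, stars drift toward the current best star (the black hole), and stars that cross the event horizon are replaced by random ones. The function names (`black_hole_init`, `fcm`, and the helpers), the parameter defaults, and the use of plain numeric vectors in place of document vectors (for example TF-IDF) are all illustrative assumptions.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m=2.0):
    """FCM membership matrix u[i][j] of point i in cluster j (requires m > 1)."""
    u = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        inv = [max(sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def objective(points, centers, m=2.0):
    """FCM objective J_m: membership-weighted sum of squared distances."""
    u = memberships(points, centers, m)
    return sum(u[i][j] ** m * sq_dist(p, c)
               for i, p in enumerate(points)
               for j, c in enumerate(centers))


def fcm(points, centers, m=2.0, iters=100, tol=1e-9):
    """Alternate membership and center updates from the given initial centers."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append([sum(wi * p[k] for wi, p in zip(w, points)) / total
                                for k in range(dim)])
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def random_star(points, c, rng):
    """A candidate solution: c centers copied from randomly chosen data points."""
    return [list(p) for p in rng.sample(points, c)]


def black_hole_init(points, c, n_stars=10, iters=30, m=2.0, seed=0):
    """Search for initial FCM centers; the black hole is the best star so far."""
    rng = random.Random(seed)
    stars = [random_star(points, c, rng) for _ in range(n_stars)]
    fits = [objective(points, s, m) for s in stars]
    bh = min(range(n_stars), key=fits.__getitem__)
    for _ in range(iters):
        # Move every other star a random fraction of the way toward the black hole.
        for i in range(n_stars):
            if i == bh:
                continue
            r = rng.random()
            stars[i] = [[x + r * (b - x) for x, b in zip(cs, cb)]
                        for cs, cb in zip(stars[i], stars[bh])]
            fits[i] = objective(points, stars[i], m)
            if fits[i] < fits[bh]:
                bh = i  # a better star becomes the new black hole
        # Event horizon: stars that come too close are replaced by random stars.
        total = sum(fits)
        radius = fits[bh] / total if total > 0 else 0.0
        for i in range(n_stars):
            if i == bh:
                continue
            dist = math.sqrt(sum(sq_dist(a, b)
                                 for a, b in zip(stars[i], stars[bh])))
            if dist < radius:
                stars[i] = random_star(points, c, rng)
                fits[i] = objective(points, stars[i], m)
                if fits[i] < fits[bh]:
                    bh = i
    return stars[bh]
```

A call such as `fcm(points, black_hole_init(points, c=2))` first searches for good starting centers and then runs ordinary FCM from them. Because the black hole is only ever replaced by a star with a lower objective value, the centers handed to FCM are never worse than the best candidate seen during the search.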