Journal of Northeastern University (Natural Science) ›› 2008, Vol. 29 ›› Issue (7): 952-955.

• Articles •

Weighted naive Bayes classification algorithm based on correlation coefficients

Zhang, Ming-Wei (1); Wang, Bo (1); Zhang, Bin (1); Zhu, Zhi-Liang (2)

  (1) School of Information Science and Engineering, Northeastern University, Shenyang 110004, China; (2) School of Software, Northeastern University, Shenyang 110004, China
  • Received: 2013-06-22 Revised: 2013-06-22 Online: 2008-07-15 Published: 2013-06-22
  • Contact: Zhang, M.-W.
  • Supported by:
    National Natural Science Foundation of China (60773218)

Abstract: The conditional independence assumption underlying the naive Bayes classifier is rarely satisfied in practice. To address this problem, a weighted naive Bayes classification model based on correlation coefficients is proposed. Correlation coefficients between each condition attribute and the decision attribute are computed, and the condition attributes are weighted accordingly, which improves the classification performance of naive Bayes while preserving its simplicity. A method for deriving the attribute weights from the correlation coefficients is given first; the corresponding algorithm is then described, and its principle is analyzed and proved. Simulation experiments on a traditional Chinese medicine paediatric pneumonia case data set and on UCI data sets verify the effectiveness of the method.

Keywords: data mining, classification algorithm, naive Bayes, weighted naive Bayes, correlation coefficient
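
The general idea of a correlation-weighted naive Bayes classifier can be illustrated with a minimal sketch. The abstract does not state the paper's weighting formula, so everything specific here is an assumption made for illustration: the weight of each attribute is taken as the absolute Pearson correlation between the integer-coded attribute and the class label, the weight is applied as an exponent on the conditional probability (a multiplier in log space), and Laplace smoothing is used. The names `pearson` and `WeightedNaiveBayes` are hypothetical. Pearson correlation on integer codes is only meaningful for binary or ordinal values.

```python
import math
from collections import Counter, defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


class WeightedNaiveBayes:
    """Naive Bayes over integer-coded categorical attributes with per-attribute weights."""

    def fit(self, X, y):
        d = len(X[0])
        self.n = len(X)
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        # Assumption (not from the paper): weight_j = |Pearson r| between
        # attribute column j and the class label.
        self.weights = [abs(pearson([row[j] for row in X], y)) for j in range(d)]
        # counts[j][c][v] = number of training rows of class c with attribute j == v
        self.counts = [defaultdict(Counter) for _ in range(d)]
        self.values = [set() for _ in range(d)]
        for row, c in zip(X, y):
            for j, v in enumerate(row):
                self.counts[j][c][v] += 1
                self.values[j].add(v)
        return self

    def log_scores(self, row):
        """Return log P(c) + sum_j w_j * log P(x_j | c) for every class c."""
        scores = {}
        for c in self.classes:
            s = math.log(self.class_count[c] / self.n)
            for j, v in enumerate(row):
                # Laplace-smoothed estimate of P(x_j = v | c)
                p = (self.counts[j][c][v] + 1) / (self.class_count[c] + len(self.values[j]))
                s += self.weights[j] * math.log(p)
            scores[c] = s
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Toy data: attribute 0 determines the class, attribute 1 is unrelated to it.
X = [[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = WeightedNaiveBayes().fit(X, y)
print(model.weights)
print(model.predict([0, 1]), model.predict([1, 0]))
```

In this toy data the uninformative attribute receives a weight of zero and so drops out of the score, while the informative attribute keeps its full influence. With all weights set to 1 the sketch reduces to ordinary naive Bayes.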

CLC Number: