Journal of Northeastern University (Natural Science) ›› 2016, Vol. 37 ›› Issue (10): 1388-1392. DOI: 10.12068/j.issn.1005-3026.2016.10.005

• Information and Control •

Primary Value Oriented Class Label Characteristic Analysis

ZHANG Ming-wei¹, ZHANG Xiao-xu², LIU Ying¹, HAN Chun-yan¹

  1. School of Software, Northeastern University, Shenyang 110169, China; 2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China.
  • Received: 2015-11-10 Revised: 2015-11-10 Online: 2016-10-15 Published: 2016-10-14
  • Contact: ZHANG Ming-wei
  • About author: ZHANG Ming-wei (born 1979), male, from Jiaozhou, Shandong; lecturer at Northeastern University, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (61100027, 61374178, 61202085, 61572117, 61572116); Fundamental Research Funds for the Central Universities (N130417003).

Abstract: To extract the essential characteristics that distinguish one class label from the others, and to improve the interpretability of labeled datasets, a primary value oriented class label characteristic analysis approach is proposed. First, an intuitive primary value oriented class label characteristic model is built. Then, a corresponding class label characteristic extraction algorithm is designed. Finally, a classification algorithm based on class label characteristic analysis is presented. Experimental results show that the model describes the characteristics of each class label in a labeled dataset intuitively and effectively, that the extraction algorithm has high execution performance, and that the classification algorithm achieves relatively high accuracy on datasets with few class labels.

Key words: data mining, classification, clustering, class label characteristic, primary value
