Journal of Northeastern University (Natural Science) ›› 2011, Vol. 32 ›› Issue (6): 761-764. DOI: -

• Article •

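A robust data preprocessing method based on an improved Fast-MCD algorithm

Wang Wei; Zhao Lijie; Chai Tianyou

  1. State Key Laboratory of Integrated Automation for Process Industry, Northeastern University
  • Received: 2013-06-19 Revised: 2013-06-19 Published: 2013-04-04
  • Supported by:
    National Natural Science Foundation of China (61020106003)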

An improved robust data preprocessing method based on the Fast-MCD algorithm

Wang, Wei (1); Zhao, Li-Jie (1); Chai, Tian-You (1)   

  1. (1) State Key Laboratory of Integrated Automation for Process Industry, Northeastern University, Shenyang 110819, China
  • Received: 2013-06-19 Revised: 2013-06-19 Published: 2013-04-04
  • Contact: Wang, W.

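Abstract (Chinese version): Outliers in industrial process data are usually handled with robust estimation techniques. The Fast-MCD algorithm has two shortcomings: its initial values are chosen at random, and when the sample is large the number of subsets into which the data are split must be specified by hand. To address both, an improved robust estimation algorithm based on fuzzy clustering is proposed, in which the cluster centers and the number of clusters serve as the basis for choosing the initial values and the number of subsets in Fast-MCD, respectively. This improves computational efficiency and makes the partitioning of large samples more reasonable. The method is applied to temperature-conductivity modeling data for sodium aluminate solution, where it identifies the outliers and can remove the undue influence of irregular data on soft-sensor modeling. Compared with Fast-MCD, it converges faster and is computationally more efficient.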

Keywords: robust estimation, soft sensing, outlier, fuzzy clustering, Mahalanobis distance

Abstract: Robust estimation is commonly used to handle outliers in industrial process data. The Fast-MCD algorithm starts from randomly chosen initial values and, for large samples, requires the number of subsets to be set manually. A new robust estimation method is proposed to address both issues: fuzzy clustering is applied first, and the cluster centers and the number of clusters are then used to determine the initial values and the number of subsets, which improves computational efficiency. The method is applied to temperature and conductivity data for sodium aluminate solution, and the simulation results show that it identifies the outliers and reduces their undue influence on soft-sensor modeling. Compared with Fast-MCD, it converges faster and is more efficient.
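As a rough illustration of the idea described in the abstract, the Python sketch below seeds MCD-style concentration steps from fuzzy c-means cluster centers instead of random subsets, then flags points by squared Mahalanobis distance. It is one reading of the abstract, not the authors' implementation. The function names, the choice of the h points nearest each center as the starting subset, the 0.75 subset fraction, the chi-square 0.975 cutoff, and the synthetic data are all assumptions. The sketch handles two-dimensional data only, assumes a non-degenerate covariance, uses the raw (uncorrected) covariance estimate, and omits the partitioning of large samples into subsets.

```python
import math

def mean_cov(pts):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in pts) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def det2(cov):
    """Determinant of a symmetric 2x2 covariance."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy

def mahalanobis_sq(p, mean, cov):
    """Squared Mahalanobis distance via the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det2(cov)

def fuzzy_c_means(pts, c, m=2.0, iters=50):
    """Plain fuzzy c-means with fuzzifier m; returns the c cluster centers."""
    n = len(pts)
    # Deterministic start (an assumption): c points spread evenly through the list.
    centers = [pts[(2 * k + 1) * n // (2 * c)] for k in range(c)]
    for _ in range(iters):
        wsum, xsum, ysum = [0.0] * c, [0.0] * c, [0.0] * c
        for x, y in pts:
            d2 = [max((x - cx) ** 2 + (y - cy) ** 2, 1e-12) for cx, cy in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            for k in range(c):
                w = (inv[k] / total) ** m  # membership raised to the fuzzifier
                wsum[k] += w
                xsum[k] += w * x
                ysum[k] += w * y
        centers = [(xsum[k] / wsum[k], ysum[k] / wsum[k]) for k in range(c)]
    return centers

def c_steps(pts, subset, h, max_iter=100):
    """Concentration steps: refit on the h points with the smallest distance."""
    for _ in range(max_iter):
        mean, cov = mean_cov([pts[i] for i in subset])
        d2 = [mahalanobis_sq(p, mean, cov) for p in pts]
        new = sorted(sorted(range(len(pts)), key=lambda i: d2[i])[:h])
        if new == subset:
            break
        subset = new
    return mean_cov([pts[i] for i in subset])

def cluster_seeded_mcd(pts, n_clusters=2, h_frac=0.75):
    """Run C-steps from each cluster center and keep the smallest determinant."""
    n = len(pts)
    h = int(h_frac * n)
    best = None
    for cx, cy in fuzzy_c_means(pts, n_clusters):
        # Seed subset: the h points closest (Euclidean) to this cluster center.
        by_dist = sorted(range(n),
                         key=lambda i: (pts[i][0] - cx) ** 2 + (pts[i][1] - cy) ** 2)
        mean, cov = c_steps(pts, sorted(by_dist[:h]), h)
        if best is None or det2(cov) < det2(best[1]):
            best = (mean, cov)
    mean, cov = best
    cutoff = -2.0 * math.log(1.0 - 0.975)  # chi-square 0.975 quantile, 2 d.o.f.
    flagged = [i for i, p in enumerate(pts) if mahalanobis_sq(p, mean, cov) > cutoff]
    return mean, cov, flagged

# Synthetic (temperature, conductivity) pairs, invented for illustration only.
data = [(70.0 + 0.1 * i, 0.5 * (70.0 + 0.1 * i) + 10.0 + 0.3 * math.sin(7.0 * i))
        for i in range(100)]
data += [(x, 0.5 * x + 40.0) for x in (71.0, 73.0, 75.0, 77.0, 79.0)]  # planted outliers
mean, cov, flagged = cluster_seeded_mcd(data)
print(flagged)
```

Because the raw covariance is not rescaled for consistency, the cutoff is stricter than it would be in a full Fast-MCD implementation, so a few regular points near the edge of the data may be flagged along with the planted outliers.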
