Journal of Northeastern University (Natural Science), 2022, Vol. 43, Issue (10): 1405-1412. DOI: 10.12068/j.issn.1005-3026.2022.10.006

• Information and Control •

A Local Outlier Detection Method Based on Objective Function

ZHOU Yu, ZHU Wen-hao, SUN Hong-yu

  1. School of Electric Power, North China University of Water Resources and Electric Power, Zhengzhou 450011, China.
  • Revised: 2021-08-27 Accepted: 2021-08-27 Published: 2022-11-07
  • Contact: ZHOU Yu
  • About author: ZHOU Yu (1979-), male, born in Zongyang, Anhui; associate professor at North China University of Water Resources and Electric Power, Ph.D.
  • Supported by:
    Training Program for Young Backbone Teachers in Higher Education Institutions of Henan Province (2018GGJS079); National Natural Science Foundation of China (U1504622).

Abstract: Traditional density-based local outlier detection algorithms do not preprocess the original data set, so their detection performance on unknown data sets is often unsatisfactory. They also compute an outlier factor for every data point, so the computational cost rises sharply as the data set grows. Based on an analysis of the local outlier detection algorithm, a local outlier detection method based on the FCM objective function, FOLOF (FCM objective function-based LOF), is proposed. First, the elbow rule is used to determine the optimal number of clusters in the data set. Then, the data set is pruned using the FCM objective function to obtain a candidate set of outliers. Finally, a weighted local outlier factor detection algorithm is used to calculate the outlier degree of each point in the candidate set. Experiments are carried out on artificial and UCI data sets, and the proposed method is compared with other related methods. The results show that the proposed algorithm improves outlier detection accuracy, reduces the computational cost, and effectively improves outlier detection performance.

Key words: outlier detection; fuzzy C-means (FCM) algorithm; objective function; local outlier factor (LOF); pruning
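The prune-then-score idea in the abstract can be illustrated with a rough sketch. The abstract does not give the pruning rule or the weighting scheme, so everything specific below is an assumption made for illustration and not the authors' FOLOF implementation. The candidate set is taken to be the points with the largest per-point share of the FCM objective J = Σ_i Σ_j u_ij^m ‖x_i − v_j‖². The plain, unweighted LOF stands in for the weighted variant. The number of clusters is fixed by hand and the elbow step is not shown. The names `fcm`, `objective_terms`, `prune`, `lof_scores` and the parameter `keep_fraction` are hypothetical.

```python
import math
import random
from functools import lru_cache


def fcm(points, c, m=2.0, iters=100, seed=0):
    """Plain fuzzy C-means. Returns (centers, memberships u[i][j])."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ij^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        for i in range(n):
            inv = [max(math.dist(points[i], ctr), 1e-12) ** (-2.0 / (m - 1.0))
                   for ctr in centers]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def objective_terms(points, centers, u, m=2.0):
    """Per-point share of J = sum_i sum_j u_ij^m * ||x_i - v_j||^2."""
    return [sum(u[i][j] ** m * math.dist(p, ctr) ** 2
                for j, ctr in enumerate(centers))
            for i, p in enumerate(points)]


def prune(terms, keep_fraction=0.2):
    """ASSUMED rule: keep the points with the largest objective terms."""
    n_keep = max(1, math.ceil(keep_fraction * len(terms)))
    order = sorted(range(len(terms)), key=lambda i: terms[i], reverse=True)
    return order[:n_keep]


def lof_scores(points, candidates, k=5):
    """Unweighted LOF for the candidates; neighbours come from all points."""
    @lru_cache(maxsize=None)
    def neighbours(i):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(len(points)) if j != i)
        return tuple(dists[:k])  # exactly k neighbours; distance ties ignored

    def k_distance(i):
        return neighbours(i)[-1][0]

    @lru_cache(maxsize=None)
    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_distance(j), d) for d, j in neighbours(i)]
        return len(reach) / max(sum(reach), 1e-12)

    return {i: sum(lrd(j) for _, j in neighbours(i))
               / (len(neighbours(i)) * lrd(i))
            for i in candidates}


# Toy data: two tight clusters plus one isolated point (index 60).
rng = random.Random(1)
points = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(30)]
points += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(30)]
points.append((9.0, -3.0))

centers, u = fcm(points, c=2)  # c is set by hand; the elbow step is not shown
candidates = prune(objective_terms(points, centers, u))
scores = lof_scores(points, candidates)
```

In this sketch only the pruned candidates receive an LOF score, which is where the saving in computation would come from. Neighbours are still searched over the full data set, which is also an assumption. On the toy data the isolated point survives pruning and gets the largest score among the candidates.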

CLC Number: