Journal of Northeastern University (Natural Science) ›› 2023, Vol. 44 ›› Issue (9): 1234-1244. DOI: 10.12068/j.issn.1005-3026.2023.09.003

• Information & Control •

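A Multi-factor Correlation Analysis Method for Typhoon Moving Track Based on NMI and HSIC0

QIAO Bai-you1, HAO Yuan-qing1, TANG Zhong1, WANG Rui2

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110819, China; 2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110169, China
  • Published: 2023-09-28
  • Corresponding author: QIAO Bai-you
  • About the author: QIAO Bai-you (born 1970), male, from Lixian County, Gansu Province; associate professor at Northeastern University.
  • Funding:
    National Key Research and Development Program of China (2019YFB1405302); National Natural Science Foundation of China (61872072).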

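Abstract: Existing nonlinear correlation analysis methods suffer from low accuracy and high computational cost, which makes them unsuitable for correlation analysis of large-scale, high-dimensional typhoon track data. To address this problem, the empirical estimate of the Hilbert-Schmidt independence criterion (HSIC0) is introduced into the study of typhoon track correlation for the first time, and a multi-factor correlation analysis method combining normalized mutual information (NMI) with HSIC0 is proposed. The method first uses NMI to filter out redundant factors that are only weakly correlated, and then uses an XGBoost machine learning model to eliminate invalid factors, thereby reducing the subsequent computational cost. On this basis, an HSIC0-based multi-factor correlation analysis is applied to the typhoon data to mine combinations of influencing factors that are strongly correlated with the typhoon moving track, which improves the prediction accuracy of the track. A series of experiments on real typhoon data sets shows that the proposed method outperforms NMI, the Pearson correlation coefficient, the distance correlation coefficient, and other analysis methods on the MSE, MAE, and R² metrics.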

Key words: typhoon moving track; correlation analysis; multi-factor; HSIC0; XGBoost

CLC number:
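
As an illustration of the HSIC0 step named in the abstract, the following is a minimal sketch of the biased empirical Hilbert-Schmidt independence criterion, computed as tr(KHLH)/(n-1)^2. The abstract does not specify the kernel or its bandwidth, so the Gaussian kernel, the median-heuristic bandwidth, and the function names `gaussian_gram` and `hsic0` are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def gaussian_gram(x, sigma=None):
    """Gaussian (RBF) Gram matrix for samples x of shape (n,) or (n, d).

    The kernel choice and the median-heuristic bandwidth are assumptions
    of this sketch; the source abstract does not state them.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D array as n samples of one factor
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    if sigma is None:
        positive = d2[d2 > 0]
        # Median heuristic: 2 * sigma^2 equals the median squared distance.
        sigma = np.sqrt(0.5 * np.median(positive)) if positive.size else 1.0
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic0(x, y):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2.

    x may hold several factors as columns, so one call scores a whole
    factor combination against the target y. Requires n >= 2 samples.
    """
    K = gaussian_gram(x)
    L = gaussian_gram(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2
```

A factor combination can be scored by stacking the candidate factors as columns, for example `hsic0(np.column_stack([f1, f2]), target)` with hypothetical arrays `f1`, `f2`, and `target`; larger values indicate stronger statistical dependence. The two n-by-n Gram matrices make this quadratic in memory, which is consistent with the abstract's point that filtering factors first reduces the later cost.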