Journal of Northeastern University(Natural Science) ›› 2023, Vol. 44 ›› Issue (1): 33-39.DOI: 10.12068/j.issn.1005-3026.2023.01.005

• Information & Control •

Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning

WANG Ying1,2, WANG Ze-hao3, LI Hong4, HUANG Wen-jun4   

  1. Henan International Joint Laboratory of Theories and Key Technologies on Intelligence Networks, Henan University, Kaifeng 475001, China; 2. Subject Innovation and Intelligence Introduction Base of Henan Higher Educational Institution - Intelligent Information Processing Innovation and Intelligence Introduction Base of Henan University Software Engineering, Henan University, Kaifeng 475001, China; 3. Institute of Intelligence Networks System, Henan University, Kaifeng 475001, China; 4. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100049, China.
  • Published:2023-01-30
  • Contact: HUANG Wen-jun

Abstract: To extract key information from threat intelligence gathered from different sources and to help government regulatory authorities carry out security risk assessment, a named entity recognition (NER) method for threat intelligence is proposed. The method builds on the BiGRU-CRF model and integrates boundary features with an iterated dilated convolutional neural network (IDCNN). It targets the recognition difficulty caused by the heavy mixing of Chinese and English in threat intelligence texts and by the lack of specialized vocabulary. First, entities with clear boundaries, such as English words, are transformed according to a manually constructed rule dictionary, which reduces the information loss the model tends to incur when processing long texts. Local features and global contextual features are then obtained through the IDCNN and a bidirectional gated recurrent unit (BiGRU), respectively. Experiments on a threat intelligence corpus show that the proposed model outperforms the other models on the relevant evaluation metrics, reaching an F-score of 87.4%.
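To illustrate why stacked dilated convolutions can capture local context efficiently, the following is a minimal sketch in plain Python. It is not the authors' implementation: the kernel width of 3, the dilation rates (1, 2, 4), the all-ones kernel, and the function names `dilated_conv1d`, `stack`, and `receptive_field` are all assumptions chosen for illustration. It shows that each added layer widens the span of input positions that influence one output position.

```python
def dilated_conv1d(xs, kernel, dilation):
    """1-D dilated convolution with zero padding; output has the input's length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around index i.
            j = i + (k - half) * dilation
            if 0 <= j < len(xs):
                acc += w * xs[j]
        out.append(acc)
    return out


def stack(xs, kernel, dilations):
    """Apply one dilated convolution per dilation rate, in order."""
    for d in dilations:
        xs = dilated_conv1d(xs, kernel, d)
    return xs


def receptive_field(kernel_width, dilations):
    """Number of input positions seen by one output position of the stack."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


# Hypothetical configuration: three layers, kernel width 3, dilations 1, 2, 4.
print(receptive_field(3, [1, 2, 4]))  # 15
print(receptive_field(3, [1, 1, 1]))  # 7 (same depth, no dilation)

# An impulse in the middle of the sequence spreads over 15 positions.
signal = [0.0] * 21
signal[10] = 1.0
response = stack(signal, [1.0, 1.0, 1.0], [1, 2, 4])
print([i for i, v in enumerate(response) if v != 0.0])
```

With the same number of layers, the dilated stack covers roughly twice as many positions as the undilated one, which is the property that makes this kind of network attractive for long input sequences.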

Key words: threat intelligence; dilated convolution; named entity recognition (NER); information extraction; deep learning
