Journal of Northeastern University Natural Science ›› 2020, Vol. 41 ›› Issue (10): 1382-1387.DOI: 10.12068/j.issn.1005-3026.2020.10.003

• Information & Control •

Entity Recognition Method for Judicial Documents Based on BERT Model

CHEN Jian, HE Tao, WEN Ying-you, MA Lin-tao   

  1. School of Computer Science & Engineering / Neusoft Research Institute, Northeastern University, Shenyang 110169, China.
  • Received: 2020-02-28 Revised: 2020-02-28 Online: 2020-10-15 Published: 2020-10-20
  • Contact: HE Tao

Abstract: Manual analysis of case files is prone to omitting case entities and is inefficient at feature extraction. To address this, the bidirectional encoder representations from transformers (BERT) pre-training model was combined with traditional long short-term memory networks and conditional random fields, and the model parameters were fine-tuned on a manually labeled corpus for entity recognition. The semantic encodings output by the pre-training layer were then decoded by the long short-term memory networks and conditional random fields to complete entity extraction. The pre-training model has a huge number of parameters, powerful feature-extraction ability, and multi-dimensional semantic representations of entities, which effectively improves entity extraction. The experimental results show that the proposed model achieves more than 89% entity extraction accuracy, significantly outperforming traditional recurrent neural network and convolutional neural network models.
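The final decoding step described in the abstract, where the conditional random field layer selects the best tag sequence from per-token scores, can be sketched with a Viterbi search. This is a minimal illustration, not the paper's implementation: the tag set, emission scores (which in the paper would come from the BERT + BiLSTM layers), and transition scores are all invented for the example.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   list of {tag: score} dicts, one per token
                 (stand-in for BERT+BiLSTM outputs)
    transitions: {(prev_tag, tag): score} learned transition scores
    tags:        list of possible tags
    """
    n = len(emissions)
    # dp[i][tag] = (best score of a path ending in `tag` at token i, backpointer)
    dp = [{t: (emissions[0].get(t, 0.0), None) for t in tags}]
    for i in range(1, n):
        row = {}
        for t in tags:
            # best previous tag to transition from
            prev = max(tags, key=lambda p: dp[i - 1][p][0] + transitions.get((p, t), 0.0))
            score = dp[i - 1][prev][0] + transitions.get((prev, t), 0.0)
            row[t] = (score + emissions[i].get(t, 0.0), prev)
        dp.append(row)
    # backtrack from the best final tag
    last = max(tags, key=lambda t: dp[-1][t][0])
    path = [last]
    for i in range(n - 1, 0, -1):
        last = dp[i][last][1]
        path.append(last)
    return list(reversed(path))
```

Because the transition scores penalize illegal moves (e.g. an I-tag not preceded by a matching B-tag), the CRF can reject tag sequences that a per-token classifier alone would emit.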

Key words: deep learning, pre-training model, bidirectional long short-term memory, conditional random field, named entity recognition
