Journal of Northeastern University (Natural Science) ›› 2020, Vol. 41 ›› Issue (10): 1382-1387. DOI: 10.12068/j.issn.1005-3026.2020.10.003

• Information & Control •

Entity Recognition Method for Judicial Documents Based on the BERT Model

CHEN Jian, HE Tao, WEN Ying-you, MA Lin-tao

  1. (School of Computer Science and Engineering / Neusoft Research Institute, Northeastern University, Shenyang 110169, China)
  • Received: 2020-02-28; Revised: 2020-02-28; Online: 2020-10-15; Published: 2020-10-20
  • Corresponding author: CHEN Jian
  • About the authors: CHEN Jian (b. 1982), male, from Shenyang, Liaoning, is an associate professor at Northeastern University; HE Tao (b. 1981), male, from Shenyang, Liaoning, is a senior engineer at Neusoft Research Institute, Northeastern University; WEN Ying-you (b. 1974), male, from Shenyang, Liaoning, is a professor and doctoral supervisor at Northeastern University.
  • Supported by:
    National Key R&D Program of China (2018YFC0830601); Key R&D Program of Liaoning Province (2019JH2/10100027); Fundamental Research Funds for the Central Universities (N171802001); Liaoning Revitalization Talents Program (XLYC1802100).

Entity Recognition Method for Judicial Documents Based on BERT Model

CHEN Jian, HE Tao, WEN Ying-you, MA Lin-tao   

  1. School of Computer Science & Engineering / Neusoft Research Institute, Northeastern University, Shenyang 110169, China.
  • Received:2020-02-28 Revised:2020-02-28 Online:2020-10-15 Published:2020-10-20
  • Contact: CHEN Jian

Abstract: Manual analysis of case files easily leads to the omission of case entities and to inefficient feature extraction. To address this, a pre-trained bidirectional encoder representations from Transformers (BERT) model is adopted. Its parameters are fine-tuned on a manually annotated corpus, and a long short-term memory network with a conditional random field then decodes the semantic encodings output by the previous layer to complete entity extraction. The pre-trained model's huge parameter count, strong feature-extraction capability, and multi-dimensional semantic representations of entities effectively improve entity extraction. Experimental results show that the proposed model achieves over 89% entity-extraction accuracy, significantly outperforming traditional recurrent and convolutional neural network models.

Key words: deep learning, pre-training model, bidirectional long short-term memory network, conditional random field, named entity recognition

Abstract: Manual analysis of case files easily causes case entities to be omitted and makes feature extraction inefficient. Therefore, a bidirectional encoder representations from transformers (BERT) pre-training model was adopted, and its parameters were fine-tuned on a manually labeled corpus for entity recognition. The semantic codes output by this layer were then decoded by long short-term memory networks and conditional random fields to complete entity extraction. The pre-training model has the advantages of huge parameters, powerful feature extraction ability and multi-dimensional semantic representation of entities, which can effectively improve the effect of entity extraction. The experimental results showed that the proposed model can achieve more than 89% entity extraction accuracy, which is significantly better than the traditional recurrent neural network and convolutional neural network models.
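The decoding step the abstract describes, a conditional random field choosing the globally best tag sequence over per-token scores from the BERT/BiLSTM encoder, amounts to Viterbi decoding. Below is a minimal pure-Python sketch; the BIO tag set, transition scores, and emission scores are illustrative stand-ins, not values from the paper:

```python
# Toy Viterbi decoder: a CRF layer picks the globally best BIO tag
# sequence given per-token emission scores (here, hand-made numbers
# standing in for BERT+BiLSTM outputs) and transition scores.

TAGS = ["O", "B-PER", "I-PER"]

# TRANS[i][j]: score of moving from tag i to tag j. The large negative
# score forbids O -> I-PER (an entity cannot start with an "inside" tag).
TRANS = {
    "O":     {"O": 0.5, "B-PER": 0.0, "I-PER": -10.0},
    "B-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 1.0},
    "I-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 0.5},
}

def viterbi(emissions):
    """emissions: one {tag: score} dict per token; returns best tag path."""
    scores = {t: emissions[0][t] for t in TAGS}  # best path score ending in t
    backptrs = []                                # best previous tag, per step
    for em in emissions[1:]:
        new_scores, bp = {}, {}
        for t in TAGS:
            s, p = max((scores[p] + TRANS[p][t] + em[t], p) for p in TAGS)
            new_scores[t], bp[t] = s, p
        scores = new_scores
        backptrs.append(bp)
    tag = max(TAGS, key=lambda t: scores[t])     # best final tag
    path = [tag]
    for bp in reversed(backptrs):                # follow backpointers
        tag = bp[tag]
        path.append(tag)
    return path[::-1]

# Emission scores for three tokens, e.g. "Zhang / Wei / signed":
# the first two tokens look like a person name.
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.0, "I-PER": 2.0},
    {"O": 2.0, "B-PER": 0.0, "I-PER": 0.1},
]
print(viterbi(emissions))  # ['B-PER', 'I-PER', 'O']
```

Because transitions are scored jointly with emissions, the CRF can veto illegal sequences such as O → I-PER even when the per-token classifier prefers them, which is why it serves as the final decoding layer rather than a simple per-token argmax.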

Key words: deep learning, pre-training model, bidirectional long short-term memory, conditional random field, named entity recognition
