Journal of Northeastern University (Natural Science) ›› 2021, Vol. 42 ›› Issue (12): 1681-1687. DOI: 10.12068/j.issn.1005-3026.2021.12.002

• Information and Control •

A Charge Prediction Method Based on Graph Attention Network: CP-GAT

ZHAO Qi-hui¹, LI Da-peng², GAO Tian-han¹, WEN Ying-you³

  1. Software College, Northeastern University, Shenyang 110169, China; 2. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China; 3. School of Computer Science and Engineering / Neusoft Research Institute, Northeastern University, Shenyang 110169, China
  • Revised: 2021-03-25 Accepted: 2021-03-25 Published: 2021-12-17
  • Corresponding author: ZHAO Qi-hui (as listed in the Chinese record); GAO Tian-han (as listed in the English record)
  • About the authors: ZHAO Qi-hui (born 1992), male, from Shenyang, Liaoning, PhD candidate at Northeastern University; GAO Tian-han (born 1978), male, from Shenyang, Liaoning, professor and doctoral supervisor at Northeastern University; WEN Ying-you (born 1974), male, from Shenyang, Liaoning, professor and doctoral supervisor at Northeastern University.
  • Supported by:
    National Key Research and Development Program of China (2018YFC0830601).

Abstract: Charge prediction is the task of predicting the charge in a case from its text. Existing methods perform poorly on similar charges and on long-tail datasets. To address this, a charge prediction method based on a graph attention network, CP-GAT, is proposed. First, the fact descriptions of cases in a judicial document dataset, together with the law articles corresponding to each case, are used to build a heterogeneous graph. The graph contains two types of nodes (word nodes and case nodes) and two types of edges (word-word edges and word-case edges). A graph attention network then extracts features from this heterogeneous graph, and the resulting feature vectors are fed into a charge classifier that outputs the charge for each case. Experiments on the CAIL2018 legal dataset show that CP-GAT outperforms the methods used for comparison, reaching an accuracy of 95.2% and a macro F1 of 66.1, which indicates that the proposed method improves performance on the charge prediction task.
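The graph structure described in the abstract (word nodes, case nodes, word-word edges, word-case edges) can be illustrated with a small sketch. The abstract does not say how edges are weighted or how law article information enters the graph, so everything below is an assumption made for illustration: the function name `build_hetero_graph`, the tokenized input format, the sliding-window co-occurrence count for word-word edges, and the term-frequency count for word-case edges are all hypothetical, and law articles are left out entirely.

```python
from collections import defaultdict
from itertools import combinations


def build_hetero_graph(cases, window=3):
    """Build a toy word/case graph from tokenized case descriptions.

    cases: list of token lists, one per case (hypothetical input format).
    Returns (nodes, edges): nodes maps node id -> "word" or "case";
    edges maps an (id, id) pair -> integer weight.
    """
    nodes = {}
    edges = defaultdict(int)
    for i, tokens in enumerate(cases):
        case_id = f"case:{i}"
        nodes[case_id] = "case"
        for tok in tokens:
            word_id = f"word:{tok}"
            nodes[word_id] = "word"
            # Word-case edge, weighted by term frequency (assumption).
            edges[(word_id, case_id)] += 1
        # Word-word edges: count the sliding windows in which two distinct
        # words co-occur (assumption; the paper may weight edges differently).
        for start in range(max(1, len(tokens) - window + 1)):
            span = sorted(set(tokens[start:start + window]))
            for a, b in combinations(span, 2):
                edges[(f"word:{a}", f"word:{b}")] += 1
    return nodes, dict(edges)


# Two tiny made-up "cases", already tokenized.
nodes, edges = build_hetero_graph(
    [["steal", "car", "night"], ["steal", "phone"]], window=2
)
```

In this toy graph the word node `word:steal` is linked to both case nodes, which is the kind of shared neighborhood a graph attention layer could weigh when computing case-node features. The attention layer and the classifier themselves are not sketched here.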

Key words: graph attention network; charge prediction; node feature extraction; heterogeneous graph; law article information
