Journal of Northeastern University (Natural Science), 2023, Vol. 44, Issue 9: 1251-1258. DOI: 10.12068/j.issn.1005-3026.2023.09.005

• Information and Control •

A Text Summarization Model Guided by Key Information

LIN Zhou, ZHOU Qi-feng

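  1. (School of Aerospace Engineering, Xiamen University, Xiamen 361100, Fujian, China)
  • Published: 2023-09-28
  • Corresponding author: LIN Zhou
  • About the authors: LIN Zhou (1997-), male, born in Putian, Fujian, master's student at Xiamen University; ZHOU Qi-feng (1976-), female, born in Changchun, Jilin, professor at Xiamen University.
  • Funding:
    Supported by the National Natural Science Foundation of China (62171391).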

A Text Summarization Model Guided by Key Information

LIN Zhou, ZHOU Qi-feng   

  1. School of Aerospace Engineering, Xiamen University, Xiamen 361100, China.
  • Published: 2023-09-28
  • Contact: ZHOU Qi-feng

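Abstract: Existing abstractive text summarization models pay insufficient attention to keyword information and therefore lose key information from the input text. To address this, a keyword semantic information enhancement pointer-generator networks (KSIE-PGN) model is proposed. First, a DistilBERT-based keyword extraction model, keywords selection method based on BERT (KSBERT), is built. Second, a coverage mechanism based on a keyword mask is proposed, so that the model retains continuous attention to keywords during decoding while the coverage mechanism is in use. Then, the KSIE-PGN model fuses several kinds of keyword information during decoding, including the keyword semantic vector and the keyword context vector, which addresses the decoder's loss of key information from the input text. Experimental results on the CNN/Daily Mail dataset show that the KSIE-PGN model captures the key information in the input text fairly well.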

Keywords: abstractive text summarization; pointer-generator network; keyword information; keyword mask; coverage mechanism

Abstract: Existing abstractive text summarization models pay too little attention to keyword information, which leads to the loss of key information from the input text. A keyword semantic information enhancement pointer-generator network, named KSIE-PGN, is therefore proposed. First, a keyword selection model, KSBERT, is built to extract keywords. Second, a keyword-masked coverage mechanism is proposed: while the coverage mechanism is active, the model keeps attending to keywords throughout decoding. Third, the KSIE-PGN model integrates keyword information into the decoding process, including the keyword semantic vector and the keyword context vector, so that the decoder avoids losing the key information in the input text. Experimental results on the CNN/Daily Mail dataset show that the model captures the key information in the input text well.

Keywords: abstractive text summarization; pointer-generator network; keyword information; keyword mask; coverage mechanism
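
The abstract names a keyword-masked coverage mechanism but does not give its formula, so the following is only a minimal sketch under stated assumptions. It starts from the standard coverage loss of pointer-generator networks, which sums min(a_i, c_i) over source positions (a_i is the current attention weight, c_i the attention accumulated over earlier steps), and assumes that keyword positions are simply exempted from that penalty. The function names `coverage_step` and `total_coverage_loss` and the 0/1 `keyword_mask` are hypothetical and not taken from the paper.

```python
def coverage_step(attn, coverage, keyword_mask):
    """One decoding step of a coverage loss with a keyword mask (illustrative).

    attn:         attention weights over source tokens at this step
    coverage:     attention accumulated over all previous steps
    keyword_mask: 1 for keyword positions, 0 for all other positions
    Returns (step_loss, updated_coverage).
    """
    # Standard coverage penalty: min(a_i, c_i) discourages re-attending.
    # Assumption: keyword positions (mask == 1) are excluded from the penalty,
    # so the decoder may keep attending to them.
    step_loss = sum(
        min(a, c) * (1 - m) for a, c, m in zip(attn, coverage, keyword_mask)
    )
    updated = [c + a for c, a in zip(coverage, attn)]
    return step_loss, updated


def total_coverage_loss(attn_steps, keyword_mask):
    """Sum the masked coverage loss over a sequence of decoding steps."""
    coverage = [0.0] * len(keyword_mask)
    total = 0.0
    for attn in attn_steps:
        step_loss, coverage = coverage_step(attn, coverage, keyword_mask)
        total += step_loss
    return total
```

With two source tokens attended equally at two steps, marking the second token as a keyword removes its share of the penalty, while the non-keyword token is still discouraged from being attended twice. The paper's actual mechanism may weight or combine the mask differently.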

CLC number: