
Journal of Northeastern University(Natural Science) ›› 2026, Vol. 47 ›› Issue (1): 89-98.DOI: 10.12068/j.issn.1005-3026.2026.20240234
• Information & Control •
Jun LIU1, Yue CAO1, Xiang-jun LIU2, Hong-yan WANG1
Received:2024-12-24
Online:2026-01-15
Published:2026-03-17
Contact: Jun LIU
Jun LIU, Yue CAO, Xiang-jun LIU, Hong-yan WANG. Research and Implementation of Knowledge Extraction in Aviation Accident Domain[J]. Journal of Northeastern University(Natural Science), 2026, 47(1): 89-98.
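| Input: sentence sample set |
|---|
| Output: sentence set after cluster partitioning |
| Begin: |
| 1. Randomly select K sentence samples from dataset D as the initial cluster centers: |
| 2. Initialize the cluster partition C as |
| 3. For each sample in the sample set M |
| 4. Compute the sentence vector |
| 5. Assign the sample to the cluster with the minimum distance |
| 6. Recompute the centroid of each cluster |
| 7. If none of the K cluster centers has moved, or the objective function has converged, or the number of iterations has reached the preset limit N, go to step 8 |
| 8. Output the cluster partition |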
Table 1 Implementation steps of clustering distant supervision method
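
The steps in Table 1 follow the usual K-means loop, so they can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the formulas in the table did not survive extraction, so the squared Euclidean distance, the handling of empty clusters, and the names `kmeans`, `squared_distance`, and `mean_vector` are all assumptions made here. Sentence vectors are taken to be plain lists of floats.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed metric; the source does not name one)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, max_iter=100, seed=0):
    """Cluster sentence vectors; returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Step 1: pick K samples at random as the initial cluster centers.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(max_iter):  # Step 7: cap on the number of iterations (N)
        # Steps 3-5: assign every sample to its nearest center.
        assignments = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Step 6: recompute each centroid; an empty cluster keeps its old center.
        new_centroids = []
        for j in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == j]
            new_centroids.append(mean_vector(members) if members else centroids[j])
        # Step 7: stop once no center has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Step 8: output the cluster partition.
    return assignments, centroids


if __name__ == "__main__":
    toy_vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    labels, centers = kmeans(toy_vectors, k=2)
    print(labels, centers)
```

In practice the sentence vectors would come from an embedding model, and a library implementation of K-means would likely be preferable to this hand-written loop.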
| Model | P | R | F1 |
|---|---|---|---|
| CRF | 83.68 | 84.76 | 84.22 |
| GRU | 85.48 | 86.62 | 86.05 |
| IDCNN | 83.23 | 79.56 | 81.35 |
| BiGRU | 86.56 | 87.23 | 86.89 |
| BiGRU-CRF | 88.80 | 87.16 | 87.97 |
| IDCNN-CRF | 87.39 | 81.64 | 84.41 |
| BERT-BiGRU-CRF | 92.20 | 91.00 | 91.59 |
| BERT-IDCNN-CRF | 91.80 | 88.10 | 89.91 |
| BERT-BiGRU-IDCNN-CRF | 93.15 | 90.39 | 91.74 |
| BERT-improved BiGRU-IDCNN-CRF | 94.69 | 91.72 | 93.18 |
Table 2 Comparison results of different named entity recognition models
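
The F1 column is consistent with the standard harmonic mean of precision and recall, which the short sketch below computes. The function name `f1_score` is chosen here for illustration; the inputs are the CRF row of Table 2, and the result rounds to the 84.22 reported there.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; the output uses the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # P and R for the CRF baseline, in percent.
    print(round(f1_score(83.68, 84.76), 2))
```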
| Model | Micro_P | Micro_R | Micro_F1 |
|---|---|---|---|
| PCNN | 75.27 | 70.18 | 72.48 |
| ATT_Z+PCNN | 76.27 | 81.03 | 78.50 |
| PCNN+ATT_C | 80.12 | 78.86 | 79.48 |
| APCNNA | 81.59 | 83.26 | 82.42 |
Table 3 Performance comparison of models with and without the attention mechanism (%)
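
The Micro_P, Micro_R, and Micro_F1 columns presumably use the standard micro-averaged definitions, in which true positives, false positives, and false negatives are pooled over all relation types before the ratios are taken. The sketch below assumes that reading; the function name `micro_prf`, the relation labels, and the counts are hypothetical and do not come from the paper.

```python
def micro_prf(counts):
    """Micro-averaged (P, R, F1) in [0, 1].

    counts maps each relation label to a (tp, fp, fn) tuple.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical per-relation counts, for illustration only.
    example_counts = {"cause_of": (8, 2, 2), "located_at": (2, 2, 0)}
    print(micro_prf(example_counts))
```

Because the counts are pooled first, frequent relation types weigh more heavily in the micro-averaged scores than rare ones.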
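| Method | Traditional distant supervision: Micro_P/% | Traditional distant supervision: Micro_R/% | Traditional distant supervision: Micro_F1/% | Traditional distant supervision: t/h | Clustering distant supervision: Micro_P/% | Clustering distant supervision: Micro_R/% | Clustering distant supervision: Micro_F1/% | Clustering distant supervision: t/h |
|---|---|---|---|---|---|---|---|---|
| Mintz | 39.80 | 35.20 | 37.36 | 2 | 40.15 | 38.23 | 39.17 | 2.2 |
| MultiR | 54.20 | 40.15 | 46.13 | 1 | 50.16 | 48.74 | 49.44 | 1.2 |
| MIMLRE | 42.60 | 36.90 | 39.54 | 4 | 45.20 | 40.10 | 42.50 | 4.2 |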
Table 4 Effect of clustering distant supervision method
| Model | Micro_P | Micro_R | Micro_F1 |
|---|---|---|---|
| MultiR | 50.16 | 48.74 | 49.44 |
| MIMLRE | 45.20 | 40.10 | 42.50 |
| BGWA | 69.46 | 71.24 | 70.34 |
| PCNN+ONE | 72.56 | 74.92 | 73.72 |
| PCNN | 75.27 | 70.18 | 72.48 |
| APCNNA | 81.59 | 83.26 | 82.42 |
| APCNNA+RL | 84.16 | 83.41 | 83.96 |
Table 5 Comparison results of different models