A Multi-scale Graph Representation Learning Model Based on Electronic Health Records

doi:10.12068/j.issn.1005-3026.2026.20259019

Abstract

Existing graph representation learning methods for electronic health records (EHR) rely primarily on the local information of a single patient and overlook potential associations among patients in disease progression and treatment pathways, which limits the generalizability and robustness of the models. To address this issue, a hybrid multi-level graph neural network (H-MGNN) model is proposed and applied to mortality prediction for intensive care unit (ICU) patients. The model constructs a patient-patient (P-P) graph at the macroscopic level and a taxonomy-note-word (T-N-W) hypergraph at the microscopic level, and incorporates temporal dependencies within the hypergraph to achieve multi-scale fusion of patient features. In addition, a hybrid embedding (Hybrid-E) algorithm is designed to extract and integrate latent patient features and improve prediction accuracy. Experimental results on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset show that H-MGNN significantly outperforms existing methods on in-hospital mortality prediction and other tasks, validating its effectiveness and superiority in complex EHR data mining.
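
The two graph structures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how either structure is built, so everything below is an assumption made for illustration: words are treated as nodes with one hyperedge per note and one per taxonomy (a common convention in hypergraph text models), and each patient is linked to the k most cosine-similar patients. The names `build_hypergraph` and `build_patient_graph`, the parameter `k`, and the toy inputs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def build_hypergraph(notes):
    """Build note-level and taxonomy-level hyperedges over word nodes.

    notes: iterable of (note_id, taxonomy, text) tuples.
    Assumption: whitespace tokenization and lowercasing stand in for
    whatever preprocessing the paper actually uses.
    """
    note_edges = defaultdict(set)      # note_id  -> set of word nodes
    taxonomy_edges = defaultdict(set)  # taxonomy -> set of word nodes
    for note_id, taxonomy, text in notes:
        words = set(text.lower().split())
        note_edges[note_id] |= words
        taxonomy_edges[taxonomy] |= words
    return dict(note_edges), dict(taxonomy_edges)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_patient_graph(embeddings, k=1):
    """Link each patient to its k most similar patients (undirected edges).

    embeddings: dict mapping patient_id -> feature vector.
    Assumption: k-nearest-neighbour linking by cosine similarity is a
    placeholder; the paper's actual P-P construction is not given here.
    """
    edges = set()
    for p, vec in embeddings.items():
        ranked = sorted(
            ((cosine(vec, other), q) for q, other in embeddings.items() if q != p),
            reverse=True,
        )
        for _, q in ranked[:k]:
            edges.add(tuple(sorted((p, q))))
    return edges


notes = [
    ("n1", "radiology", "Chest clear"),
    ("n2", "nursing", "chest pain"),
]
note_edges, taxonomy_edges = build_hypergraph(notes)
patient_edges = build_patient_graph(
    {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}, k=1
)
```

In this toy setup the word "chest" belongs to both note hyperedges, which is how a hypergraph can capture word co-occurrence across notes, while the patient graph adds links across patients that a single-patient model would not have.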

Key words: electronic health record, multi-scale, hypergraph, graph neural network

CLC Number: TP 391.4

Jie-jie FAN, Xiao-juan BAN, Zhi-yan ZHANG. A Multi-scale Graph Representation Learning Model Based on Electronic Health Records [J]. Journal of Northeastern University (Natural Science), 2026, 47(1): 31-41.

Category	Model	Overall AUPRC	Overall AUROC	Hypertension AUPRC	Hypertension AUROC	Diabetes AUPRC	Diabetes AUROC
Character	FastText	17.06±0.08	62.37±0.11	25.56±0.28	62.39±0.18	31.33±0.33	67.89±0.20
Sequential	Bi-LSTM	17.67±4.19	58.75±5.78	21.75±5.25	57.39±6.11	27.52±7.57	61.86±8.38
Sequential	Bi-LSTM w/Att	17.96±0.61	62.63±1.31	26.05±1.80	63.24±1.57	33.01±3.53	68.89±1.58
Graph	TextING	34.50±7.79	78.20±4.27	36.63±8.30	80.12±4.05	36.13±8.66	80.28±3.84
Graph	InducT-GCN	43.03±1.96	82.23±0.72	41.06±2.95	85.56±1.24	40.59±3.07	84.42±1.45
Hypergraph	HyperGAT	44.42±1.96	84.00±0.84	42.32±1.78	86.41±1.01	40.08±2.45	85.03±1.20
Hypergraph	TM-HGNN*	46.15±1.43	84.45±0.62	44.50±1.66	87.15±0.34	41.24±1.72	85.68±1.05
Proposed	H-MGNN	49.80±0.54	86.35±0.30	48.67±0.72	87.95±0.45	44.62±0.80	87.85±1.16
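
For readers unfamiliar with the two metrics in the table, the following Python sketch shows one way such entries could be computed. It rests on assumptions that the page does not confirm: that each cell is a mean ± sample standard deviation over repeated runs scaled by 100, that AUPRC is estimated by average precision, and that prediction scores contain no ties when ranking for AUPRC. The function names `auroc`, `auprc` and `summarize` and the toy labels and scores are hypothetical.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auprc(labels, scores):
    """Average precision: mean precision at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("AUPRC needs at least one positive")
    return total / hits


def summarize(run_values):
    """Format per-run metric values in [0, 1] as 'mean±std' scaled by 100."""
    return f"{100 * mean(run_values):.2f}±{100 * stdev(run_values):.2f}"


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
toy_auroc = auroc(labels, scores)
toy_auprc = auprc(labels, scores)
toy_cell = summarize([0.84, 0.86])
```

AUPRC is the more sensitive of the two metrics when positives are rare, which is consistent with the table showing larger gaps between models in AUPRC than in AUROC.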

Ablation	Model	Overall AUPRC	Overall AUROC	Hypertension AUPRC	Hypertension AUROC
Remove P-P module	T-MGNN	46.27±1.05	84.90±0.84	44.65±0.60	87.45±0.67
Remove P-P and temporal dependencies	T-N-W	45.63±1.26	84.23±0.55	43.36±0.85	86.68±0.78
Add internal position	TM-HGNN*	46.15±1.43	84.45±0.62	44.50±1.66	87.15±0.34
Proposed	H-MGNN	49.80±0.54	86.35±0.30	48.67±0.72	87.95±0.45
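
The contribution of each component can be read off the ablation table as the gap between the full model and each variant. The short Python sketch below does that arithmetic on the overall-cohort means only; the numbers are copied from the table, while the dictionary layout and the name `gains` are hypothetical. Standard deviations are ignored, so the differences are descriptive and say nothing about statistical significance.

```python
# Overall-cohort means (AUPRC, AUROC) copied from the ablation table.
overall = {
    "T-MGNN": (46.27, 84.90),    # P-P module removed
    "T-N-W": (45.63, 84.23),     # P-P module and temporal dependencies removed
    "TM-HGNN*": (46.15, 84.45),  # internal position added
    "H-MGNN": (49.80, 86.35),    # full model
}

full_auprc, full_auroc = overall["H-MGNN"]

# Absolute gain of the full model over each variant, rounded to two decimals.
gains = {
    name: (round(full_auprc - auprc, 2), round(full_auroc - auroc, 2))
    for name, (auprc, auroc) in overall.items()
    if name != "H-MGNN"
}

for name, (d_auprc, d_auroc) in gains.items():
    print(f"{name}: AUPRC +{d_auprc:.2f}, AUROC +{d_auroc:.2f}")
```

On these means, removing the P-P module alone accounts for a drop of about 3.5 AUPRC points, and additionally removing the temporal dependencies lowers AUPRC by a further 0.64 points.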
