
Journal of Northeastern University(Natural Science) ›› 2024, Vol. 45 ›› Issue (6): 793-801.DOI: 10.12068/j.issn.1005-3026.2024.06.005
• Information & Control •
Pei-yun XUE1,2, Jing BAI1, Nan ZHANG3, Jian-xing ZHAO1
Received: 2023-05-23
Online: 2024-06-15
Published: 2024-09-18
Contact: Pei-yun XUE
About author: XUE Pei-yun, E-mail: xuepeiyun@tyut.edu.cn
Pei-yun XUE, Jing BAI, Nan ZHANG, Jian-xing ZHAO. VMD Based Binary Channels Speech Feature Map Extraction Algorithm for Dysarthria[J]. Journal of Northeastern University(Natural Science), 2024, 45(6): 793-801.
| Corpus | Spoken items | Total |
|---|---|---|
| Computer command words | ESCAPE, LEFT, LINE, PARAGRAPH, PASTE, RIGHT, SHIFT, SENTENCE, TAB, ALT, BACKSPACE, COMMAND, CONTROL, COPY, CUT, DELETE, DOWNWARD, ENTER, UPWARD | 19 |
| Digit words | ZERO, ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE | 10 |
Table 1 Pronunciation corpus of patients with dysarthria
| Feature | WRA/% (Model 1) | WRA/% (Model 2) | WRA/% (Model 3) |
|---|---|---|---|
| Feature 1 | 83.74 | 84.45 | 85.47 |
| Feature 2 | 85.94 | 86.49 | 87.82 |
| Feature 3 | 91.52 | 92.14 | 93.48 |
| Feature 4 | 92.69 | 93.24 | 94.34 |
| Feature 5 | 69.91 | 70.78 | 71.88 |
Table 2 Recognition results of five features on
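
The figures in Table 2 can be compared directly with a few lines of arithmetic. The sketch below is only an illustration built on the tabulated WRA values; the helper names `gain_over` and `best_feature_per_model` are hypothetical and are not part of the paper's method.

```python
# WRA values (%) copied from Table 2.
# Keys are features; each tuple holds the results for Models 1, 2 and 3.
WRA = {
    "Feature 1": (83.74, 84.45, 85.47),
    "Feature 2": (85.94, 86.49, 87.82),
    "Feature 3": (91.52, 92.14, 93.48),
    "Feature 4": (92.69, 93.24, 94.34),
    "Feature 5": (69.91, 70.78, 71.88),
}


def gain_over(baseline, feature, table=WRA):
    """Per-model WRA difference (percentage points) of `feature` over `baseline`."""
    return tuple(
        round(f - b, 2) for f, b in zip(table[feature], table[baseline])
    )


def best_feature_per_model(table=WRA):
    """Name of the feature with the highest WRA for each model column."""
    n_models = len(next(iter(table.values())))
    return [
        max(table, key=lambda name: table[name][m]) for m in range(n_models)
    ]


if __name__ == "__main__":
    print("Feature 4 vs Feature 1:", gain_over("Feature 1", "Feature 4"))
    print("Best feature per model:", best_feature_per_model())
```

Differences are reported in percentage points, not relative percentages, so they can be read against the table without further conversion.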
| Patient with dysarthria | Speech intelligibility/% | WRA/% (Feature 1) | WRA/% (Feature 2) | WRA/% (Feature 3) | WRA/% (Feature 4) |
|---|---|---|---|---|---|
| M04 | 2 | 71.76 | 77.65 | 85.88 | 87.06 |
| F03 | 6 | 79.31 | 84.48 | 89.66 | 91.38 |
| M12 | 7 | 74.12 | 76.47 | 88.24 | 89.41 |
| M01 | 15 | 70.59 | 69.12 | 83.82 | 85.29 |
| M07 | 28 | 88.37 | 90.70 | 95.35 | 93.02 |
| F02 | 29 | 84.75 | 89.83 | 94.92 | 94.92 |
| M06 | 39 | 86.59 | 86.59 | 93.90 | 95.12 |
| M16 | 43 | 87.80 | 82.93 | 91.46 | 93.90 |
| M05 | 58 | 90.24 | 92.68 | 92.68 | 95.12 |
| F04 | 62 | 85.71 | 87.30 | 93.65 | 95.24 |
| M11 | 62 | 86.42 | 88.89 | 91.36 | 92.59 |
| M09 | 86 | 84.30 | 91.74 | 95.04 | 95.87 |
| M14 | 90 | 92.77 | 93.98 | 95.18 | 96.39 |
| M08 | 93 | 89.34 | 90.98 | 94.26 | 95.90 |
| M10 | 93 | 93.10 | 93.97 | 96.55 | 97.41 |
Table 3 Recognition results of four features from 15
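
As a reading aid for Table 3, the sketch below averages the per-speaker WRA for each feature and counts how often Feature 4 beats, ties with, or trails Feature 3. It assumes an unweighted mean over the 15 speakers, which need not match the pooled figures in Table 2 if speakers contributed different numbers of utterances; the function names are hypothetical.

```python
from statistics import mean

# Rows copied from Table 3:
# speaker -> (intelligibility %, WRA % for Features 1, 2, 3, 4)
TABLE3 = {
    "M04": (2, 71.76, 77.65, 85.88, 87.06),
    "F03": (6, 79.31, 84.48, 89.66, 91.38),
    "M12": (7, 74.12, 76.47, 88.24, 89.41),
    "M01": (15, 70.59, 69.12, 83.82, 85.29),
    "M07": (28, 88.37, 90.70, 95.35, 93.02),
    "F02": (29, 84.75, 89.83, 94.92, 94.92),
    "M06": (39, 86.59, 86.59, 93.90, 95.12),
    "M16": (43, 87.80, 82.93, 91.46, 93.90),
    "M05": (58, 90.24, 92.68, 92.68, 95.12),
    "F04": (62, 85.71, 87.30, 93.65, 95.24),
    "M11": (62, 86.42, 88.89, 91.36, 92.59),
    "M09": (86, 84.30, 91.74, 95.04, 95.87),
    "M14": (90, 92.77, 93.98, 95.18, 96.39),
    "M08": (93, 89.34, 90.98, 94.26, 95.90),
    "M10": (93, 93.10, 93.97, 96.55, 97.41),
}


def mean_wra_per_feature(table=TABLE3):
    """Unweighted mean WRA (%) across speakers for Features 1 to 4."""
    return [mean(row[i] for row in table.values()) for i in range(1, 5)]


def compare_f4_to_f3(table=TABLE3):
    """Count speakers where Feature 4 is higher than, equal to, or lower than Feature 3."""
    higher = sum(1 for r in table.values() if r[4] > r[3])
    equal = sum(1 for r in table.values() if r[4] == r[3])
    lower = sum(1 for r in table.values() if r[4] < r[3])
    return higher, equal, lower


if __name__ == "__main__":
    for idx, value in enumerate(mean_wra_per_feature(), start=1):
        print(f"Feature {idx}: mean WRA = {value:.2f}%")
    print("Feature 4 vs Feature 3 (higher, equal, lower):", compare_f4_to_f3())
```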
| Patient with dysarthria | Speech intelligibility/% | WRA/% (Feature 1) | WRA/% (Feature 2) | WRA/% (Feature 3) | WRA/% (Feature 4) |
|---|---|---|---|---|---|
| M04 | 2 | 72.94 | 78.82 | 87.06 | 88.24 |
| F03 | 6 | 81.03 | 86.20 | 91.38 | 93.10 |
| M12 | 7 | 75.29 | 77.65 | 89.41 | 90.59 |
| M01 | 15 | 70.59 | 70.59 | 85.29 | 85.29 |
| M07 | 28 | 90.24 | 91.86 | 96.51 | 94.19 |
| F02 | 29 | 86.44 | 91.52 | 96.61 | 96.61 |
| M06 | 39 | 87.80 | 89.02 | 95.12 | 96.34 |
| M16 | 43 | 89.02 | 84.15 | 93.90 | 95.12 |
| M05 | 58 | 91.46 | 93.90 | 93.90 | 95.12 |
| F04 | 62 | 87.30 | 88.89 | 95.24 | 96.83 |
| M11 | 62 | 87.65 | 90.12 | 92.59 | 93.83 |
| M09 | 86 | 84.30 | 92.56 | 95.87 | 96.69 |
| M14 | 90 | 92.77 | 95.18 | 96.39 | 97.59 |
| M08 | 93 | 89.34 | 91.80 | 95.08 | 96.72 |
| M10 | 93 | 93.97 | 94.83 | 97.41 | 98.28 |
Table 4 Recognition results of four features from 15
| Method | Feature parameters | WRA/% |
|---|---|---|
| ANN+MLP[ | MFCC | 68.88 |
| LL-SVM[ | MFCC | 87.91 |
| SV+S-CNN[ | Spectrogram | 89.54 |
| Feature 2 + Model 3 | BCFbank features | 87.82 |
| Feature 4 + Model 3 | MBCFbank feature map | 94.34 |
Table 5 Comparison of the method with other
| 1 | Mohammed S Y, Sid‑Ahmed S, Brahim‑Fares Z,et al.Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network[J].EURASIP Journal on Audio,Speech,and Music Processing,2020,2020(1):1-7. |
| 2 | Al‑Qatab B A, Mustafa M B.Classification of dysarthric speech according to the severity of impairment:an analysis of acoustic features[J].IEEE Access,2021(9):18183-18194. |
| 3 | Liu S, Hu S, Xie X,et al.Recent progress in the CUHK dysarthric speech recognition system[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,2021,29(99):2267-2281. |
| 4 | Yue Z, Loweimi E, Christensen H,et al.Acoustic modelling from raw source and filter components for dysarthric speech recognition[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,2022(30):2968-2980. |
| 5 | Liang Zheng‑you, Li Yu‑xing, Sun Yu,et al.Speech recognition with dysarthria based on multi‑feature combination[J].Computer Engineering and Design,2022,43(2):567-572. (in Chinese) |
| 6 | Jiao Y, Tu M, Berisha V,et al.Simulating dysarthric speech for training data augmentation in clinical speech applications[C]//2018 IEEE International Conference on Acoustics,Speech and Signal Processing (ICASSP).Calgary,Alberta:IEEE,2018:6009-6013. |
| 7 | Yilmaz E, Mitra V, Sivaraman G,et al.Articulatory and bottleneck features for speaker‑independent ASR of dysarthric speech[J].Computer Speech & Language,2019,58:319-334. |
| 8 | Zaidi B F, Selouani S A, Boudraa M,et al.Deep neural network architectures for dysarthric speech analysis and recognition[J].Neural Computing and Applications,2021,33(15):9089-9108. |
| 9 | Mariya T A, Vijayalakshmi P, Nagarajan T.Data augmentation techniques for transfer learning‑based continuous dysarthric speech recognition[J].Circuits,Systems,and Signal Processing,2023,42(1):601-622. |
| 10 | Li Dong, Zhang Xue‑ying, Duan Shu‑fei,et al.Articulation disorder recognition based on speech fusion features and random forest[J].Journal of Xidian University,2018,45(3):149-155. (in Chinese) |
| 11 | Wu Li‑dan.Multi‑view articulation disorder speech recognition based on deep temporal network[D].Shanghai:East China Normal University,2021. (in Chinese) |
| 12 | Wang Zhao‑guo, Wei Cun‑hai, Peng Ya‑ni,et al.Sound feature extraction method of power equipment based on GFCC‑SVM‑RFE[J].Electric Power Information and Communication Technology,2022,20(9):34-42. (in Chinese) |
| 13 | Dragomiretskiy K, Zosso D.Variational mode decomposition[J].IEEE Transactions on Signal Processing,2014,62(3):531-544. |
| 14 | Fritsch J, Magimai‑Doss M.Utterance verification‑based dysarthric speech intelligibility assessment using phonetic posterior features[J].IEEE Signal Processing Letters,2021(28):224-228. |
| 15 | Shahamiri S R, Salim S.Artificial neural networks as speech recognisers for dysarthric speech:identifying the best‑performing set of MFCC parameters and studying a speaker‑independent approach[J].Advanced Engineering Informatics,2014,28(1):102-110. |
| 16 | Rajeswari N, Chandrakala S.Generative model‑driven feature learning for dysarthric speech recognition[J].Biocybernetics & Biomedical Engineering,2016,36(4):553-561. |
| 17 | Shahamiri S R.Speech vision:an end‑to‑end deep learning‑based dysarthric automatic speech recognition system[J].IEEE Transactions on Neural Systems and Rehabilitation Engineering,2021(29):852-861. |