SUN Ying, LI Ze, ZHANG Xue-ying. Speech Emotion Recognition Based on Constrained Bi-channel Model[J]. Journal of Northeastern University(Natural Science), 2023, 44(11): 1537-1542.