Journal of Northeastern University (Natural Science), 2024, Vol. 45, Issue (1): 40-48. DOI: 10.12068/j.issn.1005-3026.2024.01.006

• Information and Control •

Speech Emotion Recognition Fusing Functional Paralanguage Proportion Coefficient

Ying SUN (孙颖), Ya-ru ZHOU (周雅茹), Xue-ying ZHANG (张雪英)

  1. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China. Corresponding author: ZHANG Xue-ying, E-mail: tyzhangxy@163.com
  • Received: 2022-07-22 Online: 2024-01-15 Published: 2024-04-02
  • About the authors: SUN Ying (1981-), female, native of Taiyuan, Shanxi, associate professor at Taiyuan University of Technology, Ph.D.
    ZHANG Xue-ying (1964-), female, native of Xingtang, Hebei, professor and doctoral supervisor at Taiyuan University of Technology.
  • Supported by:
    National Natural Science Foundation of China (62271342); Natural Science Foundation of Shanxi Province (201901D111096)

Abstract:

Nonverbal vocalizations in speech, such as laughter, sighs, and sobs, are called functional paralanguage and play an important role in emotional expression. However, existing research has rarely considered how multiple types of functional paralanguage act together within a single emotion. To address this problem, an emotion recognition system that fuses the functional paralanguage proportion coefficient (FPPC) is proposed. First, FPPC features are extracted that reflect how frequently multiple types of functional paralanguage occur in an emotional utterance and how long they last. Then, an attention-based ensemble learning model (attention stacking) is built to assign different weights to the different base classifiers, and it is trained on the FPPC features. Finally, an adaptive entropy weight decision fusion method fuses traditional speech emotion recognition with the FPPC-based emotion recognition. Experimental results show that fusing the FPPC features improves the emotion recognition result by 16.84%, indicating that the FPPC features effectively raise the overall recognition rate of the system.

Key words: speech emotion recognition, proportion coefficient, functional paralanguage, attention mechanism, adaptive entropy weight decision fusion
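The abstract names two steps without giving formulas: extracting proportion features for the functional paralanguage in an utterance, and fusing two recognizers with entropy-based weights. The Python sketch below is one plausible reading of each and is not taken from the paper. The label set `PARALANGUAGE_TYPES`, the function names, the choice of count share and duration share as the features, and the "one minus normalized entropy" weighting rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical label set: the abstract only gives these three as examples.
PARALANGUAGE_TYPES = ("laughter", "sigh", "sob")


def fppc_features(events, utterance_duration):
    """Assumed feature layout: for each type, its share of all paralanguage
    events and its share of the utterance duration.

    events: list of (type, duration_in_seconds) pairs found in one utterance.
    """
    counts = defaultdict(int)
    durations = defaultdict(float)
    for kind, seconds in events:
        counts[kind] += 1
        durations[kind] += seconds
    total_events = sum(counts.values())
    features = []
    for kind in PARALANGUAGE_TYPES:
        count_share = counts[kind] / total_events if total_events else 0.0
        duration_share = (
            durations[kind] / utterance_duration if utterance_duration > 0 else 0.0
        )
        features.extend([count_share, duration_share])
    return features


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to the range [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)


def entropy_weighted_fusion(prob_vectors):
    """Assumed fusion rule: a recognizer whose output distribution has lower
    entropy (is more decisive) receives a larger weight in the fused result.

    prob_vectors: one class-probability list per recognizer, same class order.
    Returns (fused_probabilities, weights).
    """
    confidences = [max(0.0, 1.0 - normalized_entropy(p)) for p in prob_vectors]
    total = sum(confidences)
    if total == 0:
        # Every recognizer is maximally uncertain: fall back to equal weights.
        weights = [1.0 / len(prob_vectors)] * len(prob_vectors)
    else:
        weights = [c / total for c in confidences]
    n_classes = len(prob_vectors[0])
    fused = [
        sum(w * p[k] for w, p in zip(weights, prob_vectors))
        for k in range(n_classes)
    ]
    return fused, weights
```

Under these assumptions, the weights are recomputed for every utterance, which is one way to interpret "adaptive": a hesitant acoustic recognizer contributes less on an utterance where the FPPC-based recognizer is decisive, and the reverse holds when the FPPC-based output is flat.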

CLC number: