Journal of Northeastern University (Natural Science) ›› 2024, Vol. 45 ›› Issue (6): 776-785. DOI: 10.12068/j.issn.1005-3026.2024.06.003

• Information & Control •

Research on Emotion Recognition Method of Music Multimodal Data

Dong-hong HAN1, Yan-ru KONG2, Yi-meng ZHAN1, Yuan LIU1

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China
    2. NARI Group Corporation, State Grid Electric Power Research Institute, Nanjing 211000, China
  • Received: 2023-02-09 Online: 2024-06-15 Published: 2024-09-18
  • Contact: Yan-ru KONG
  • About author: KONG Yan-ru, Email: kong19960103@163.com

Abstract:

Research on music emotion recognition has broad application prospects in fields such as intelligent music recommendation and music visualization. To address the limited effectiveness and poor interpretability of emotion recognition that relies only on low-level audio features, emotion recognition methods based on multimodal music data are studied. Firstly, an emotion recognition model ERMSLM based on MIDI (musical instrument digital interface) data is constructed, which can learn the semantic information of notes. Its features consist of melodic features extracted with skip-gram and LSTM (long short-term memory), tonal features extracted by a pre-trained MLP, and manually constructed features. Secondly, an emotion recognition model ERMBT based on text data that integrates lyrics and social tags is constructed. The lyric features consist of emotional features extracted with BERT, emotion-dictionary features constructed from ANEW lists, and TF-IDF features of the lyrics. Finally, two multimodal fusion models, using feature-level fusion and decision-level fusion respectively, are constructed on the basis of the MIDI and text data. Experimental results show that the ERMSLM and ERMBT models achieve accuracies of 56.93% and 72.62%, respectively, and that the decision-level multimodal fusion model is more effective.
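As a rough illustration of the decision-level fusion mentioned above, the short Python sketch below combines the class-probability outputs of the two unimodal models by weighted averaging. The class count, probability values, and fusion weights are hypothetical placeholders, not the paper's implementation.

import numpy as np

# Hypothetical per-class probabilities from the two unimodal models
# (e.g. softmax over four emotion classes); values are illustrative only.
p_midi = np.array([0.10, 0.55, 0.25, 0.10])   # ERMSLM (MIDI-based) output
p_text = np.array([0.05, 0.70, 0.15, 0.10])   # ERMBT (text-based) output

def decision_level_fusion(p_a, p_b, w_a=0.4, w_b=0.6):
    # Weighted average of the two models' class probabilities.
    # The weights are assumptions; in practice they would be tuned on
    # validation data (e.g. favoring the stronger text-based model).
    fused = w_a * p_a + w_b * p_b
    return fused / fused.sum()   # renormalize to a valid distribution

fused = decision_level_fusion(p_midi, p_text)
print(fused, int(np.argmax(fused)))   # fused distribution and predicted class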

Key words: music emotion recognition, deep learning, multimodal, LSTM

CLC Number: