Emotion Recognition from Children Speech Signals Using Attention Based Time Series Deep Learning

  • Guitao Cao
  • Yunming Tang
  • Jiyu Sheng
  • Wenming Cao*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Children express emotion mainly through acoustic properties of the voice, such as tone and timbre, rather than through semantics, and their speech contains many lengthy fragments. This paper proposes an emotion recognition model based on time series deep learning, named attention-based Bi-directional Long Short-Term Memory (CNN-BiLSTM), to extract emotional features. After the speech signal is preprocessed, forty-dimensional Mel Frequency Cepstral Coefficient (MFCC) related parameters are extracted, including both static and dynamic features. These frequency-domain features are then enhanced by convolutional neural networks (CNNs) and used as the emotional features for children's speech emotion recognition. BiLSTM is used to address the poor performance of feature learning over long-term dependencies, and an attention mechanism is used because only a few frames in a child's speech signal carry emotional features. Compared with related speech emotion recognition models such as LSTM-CNN and 2D-CNN-LSTM, the proposed model raises the accuracy to as much as 71.6% on the FAU-AIBO children's speech emotion database.
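As an illustration of why attention suits speech in which only a few frames carry emotion, the following is a minimal sketch of attention-weighted pooling over frame-level features. It is not the authors' implementation: the function name `attention_pool`, the scoring vector `w`, and the toy input are assumptions made for this example, and a real model would learn `w` jointly with the CNN and BiLSTM layers.

```python
import math


def attention_pool(frames, w):
    """Collapse T frame vectors into one utterance-level vector.

    frames: list of T feature vectors of length D (e.g. BiLSTM outputs).
    w:      hypothetical scoring vector of length D (an assumption here;
            in a trained model it would be a learned parameter).
    """
    # One scalar relevance score per frame: dot product with w.
    scores = [sum(x * wi for x, wi in zip(frame, w)) for frame in frames]

    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frames, dimension by dimension.
    dim = len(frames[0])
    pooled = [
        sum(a * frame[d] for a, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input: four 2-dimensional frames, only the third has a strong response.
frames = [[0.1, 0.0], [0.0, 0.1], [3.0, 2.0], [0.1, 0.1]]
w = [1.0, 1.0]
pooled, weights = attention_pool(frames, w)
```

In this toy input the third frame receives almost all of the attention weight, so the pooled vector is dominated by that frame instead of being diluted by the three near-silent ones, which is what plain averaging over time would do.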

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Editors: Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1296-1300
Number of pages: 5
ISBN (Electronic): 9781728118673
State: Published - Nov 2019
Event: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States
Duration: 18 Nov 2019 to 21 Nov 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Country/Territory: United States
City: San Diego
Period: 18/11/19 to 21/11/19

Keywords

  • Attention Mechanism
  • BiLSTM
  • Deep learning
  • Emotion Recognition
  • Speech Signal
