Abstract
Children express emotion mainly through acoustic properties of the voice, such as tone and timbre, rather than through semantics, and their speech contains many lengthy fragments. This paper proposes an emotion recognition model based on time-series deep learning, an attention-based Bi-directional Long Short-Term Memory network (CNN-BiLSTM), to extract emotional features. After the speech signal is preprocessed, forty-dimensional Mel Frequency Cepstral Coefficient (MFCC) related parameters are extracted, including both static and dynamic features. These frequency-domain features are then enhanced by convolutional neural networks (CNNs) to serve as the emotional features of children's speech. The BiLSTM addresses the poor learning of long-term dependencies, and the attention mechanism is used because only a few frames in a child's speech signal carry emotional information. Compared with related speech emotion recognition models such as LSTM-CNN and 2D-CNN-LSTM, the proposed model improves accuracy, reaching up to 71.6% on the FAU-AIBO children's speech emotion database.
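The abstract motivates attention by noting that only a few frames carry emotional information. As a rough illustration of that idea, and not the authors' implementation, the sketch below pools per-frame vectors (standing in for BiLSTM outputs) into one utterance vector using softmax weights. The function names, the dot-product scoring, and the `query` vector are all assumptions introduced here; the paper's actual scoring function is not given in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse T frame vectors into a single utterance-level vector.

    frames: list of T vectors of length D (hypothetical BiLSTM outputs).
    query:  hypothetical learned scoring vector of length D (an assumption).
    Returns (weights, pooled); weights sum to 1 across the T frames.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    # Weighted sum of frames, computed one feature dimension at a time.
    dim = len(query)
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return weights, pooled


# Toy input: three frames, of which the second scores highest.
frames = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
weights, pooled = attention_pool(frames, query)
```

In this toy input the second frame receives most of the weight, so the pooled vector is dominated by that frame, which mirrors the intuition that a classifier should focus on the few emotionally salient frames.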
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 |
| Editors | Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1296-1300 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781728118673 |
| State | Published - Nov 2019 |
| Event | 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States; Duration: 18 Nov 2019 → 21 Nov 2019 |
Publication series
| Name | Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 |
|---|---|
Conference
| Conference | 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 |
|---|---|
| Country/Territory | United States |
| City | San Diego |
| Period | 18/11/19 → 21/11/19 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Attention Mechanism
- BiLSTM
- Deep learning
- Emotion Recognition
- Speech Signal
Title
Emotion Recognition from Children Speech Signals Using Attention Based Time Series Deep Learning