Speech emotion recognition systems require features to be extracted from the speech signal, typically time-, frequency-, and cepstral-domain features. Selecting an appropriate normalisation algorithm for these features is challenging, since the choice of algorithm can affect classification accuracy. This paper presents the effects of different normalisation methods applied to speech features for speech emotion recognition. Speech features are extracted from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and normalised before training machine learning and deep learning algorithms: Logistic Regression, Support Vector Machine, Multilayer Perceptron, Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). The CNN and LSTM obtained 72% for both accuracy and F1-score, outperforming the standard machine learning algorithms. Feature normalisation improved both accuracy and F1-score by more than 14% for the CNN and LSTM.
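The sketch below illustrates the kind of workflow the abstract describes: extract cepstral-domain features from speech, apply different normalisation methods, and compare classification accuracy and F1-score. It is a minimal illustration, not the paper's exact pipeline; the dataset path, the feature configuration (40 mean MFCCs), the RAVDESS filename-based label parsing, the particular scalers, and the use of Logistic Regression as a stand-in for the full set of models are all assumptions.

# Minimal sketch (assumed pipeline, not the paper's exact setup): compare
# normalisation methods on cepstral features for speech emotion recognition.
import glob
from pathlib import Path

import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

def extract_features(path):
    """Mean MFCCs over time: a common cepstral-domain feature vector."""
    signal, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
    return mfcc.mean(axis=1)

# Assumed dataset layout; RAVDESS filenames encode the emotion label as the
# third hyphen-separated field (e.g. "03-01-06-01-02-01-12.wav" -> "06").
paths = glob.glob("RAVDESS/**/*.wav", recursive=True)
X = np.array([extract_features(p) for p in paths])
y = np.array([Path(p).name.split("-")[2] for p in paths])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Fit each scaler on the training set only, then evaluate the same model
# so that differences reflect the normalisation method alone.
for name, scaler in [("min-max", MinMaxScaler()),
                     ("z-score", StandardScaler()),
                     ("robust", RobustScaler())]:
    clf = LogisticRegression(max_iter=1000)
    clf.fit(scaler.fit_transform(X_train), y_train)
    pred = clf.predict(scaler.transform(X_test))
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.3f} "
          f"F1={f1_score(y_test, pred, average='macro'):.3f}")

Fitting each scaler on the training split and reusing its statistics on the test split avoids leaking test-set information into the normalisation step, which would otherwise inflate the reported scores.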
Reference:
Sefara, T.J. 2019. The effects of normalisation methods on speech emotion recognition. In: IEEE International Multidisciplinary Information Technology and Engineering Conference (IMITEC), Vanderbijlpark, South Africa, 21-22 November 2019. http://hdl.handle.net/10204/11330
This is the accepted version of the published item.