The effects of normalisation methods on speech emotion recognition

Sefara, Tshephisho J

dc.contributor.author	Sefara, Tshephisho J
dc.date.accessioned	2020-03-17T13:08:55Z
dc.date.available	2020-03-17T13:08:55Z
dc.date.issued	2019-11
dc.identifier.citation	Sefara, T.J. 2019. The effects of normalisation methods on speech emotion recognition. In: IEEE International Multidisciplinary Information Technology and Engineering Conference (IMITEC) 2019, Vanderbijlpark, South Africa, 21-22 November 2019	en_US
dc.identifier.isbn	978-1-7281-0040-1
dc.identifier.isbn	978-1-7281-0041-8
dc.identifier.uri	https://ieeexplore.ieee.org/document/9015895
dc.identifier.uri	DOI: 10.1109/IMITEC45504.2019.9015895
dc.identifier.uri	http://hdl.handle.net/10204/11330
dc.description	Presented at: IEEE International Multidisciplinary Information Technology and Engineering Conference (IMITEC) 2019, Vanderbijlpark, South Africa, 21-22 November 2019. This is the accepted version of the published item.	en_US
dc.description.abstract	Speech emotion recognition systems require features to be extracted from the speech signal. These features include time-, frequency-, and cepstral-domain features. Selecting an appropriate normalisation algorithm for these features is challenging, since the choice of algorithm may affect classification accuracy. This paper presents the effects of different normalisation methods applied to speech features for speech emotion recognition. Speech features are extracted from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset and normalised before training machine learning and deep learning algorithms such as Logistic Regression, Support Vector Machine, Multilayer Perceptron, Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). The CNN and LSTM models obtained 72% for both accuracy and F1 score, outperforming the standard machine learning algorithms. Feature normalisation improved both accuracy and F1 score by more than 14% using CNN and LSTM.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	Workflow;23287
dc.subject	Machine learning	en_US
dc.subject	Neural networks	en_US
dc.subject	Emotion recognition	en_US
dc.subject	Normalisation method	en_US
dc.subject	Speech emotions	en_US
dc.title	The effects of normalisation methods on speech emotion recognition	en_US
dc.type	Conference Presentation	en_US
dc.identifier.apacitation	Sefara, T. J. (2019). The effects of normalisation methods on speech emotion recognition. http://hdl.handle.net/10204/11330	en_ZA
dc.identifier.chicagocitation	Sefara, Tshephisho J. "The effects of normalisation methods on speech emotion recognition." (2019): http://hdl.handle.net/10204/11330	en_ZA
dc.identifier.vancouvercitation	Sefara TJ, The effects of normalisation methods on speech emotion recognition; 2019. http://hdl.handle.net/10204/11330 .	en_ZA
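
The abstract above describes normalising extracted speech features before training a classifier. As a minimal sketch of what that step can look like, the Python below implements two common normalisation methods, z-score standardisation and min-max scaling. This record does not list which methods the paper compares, so both are assumptions chosen for illustration; the function names (`fit_zscore`, `fit_minmax`, `transform`) and the toy feature values are hypothetical.

```python
from statistics import mean, pstdev


def columns(rows):
    """Transpose a list of feature rows into a list of feature columns."""
    return list(zip(*rows))


def fit_zscore(train_rows):
    """Per-feature (mean, std) estimated from training rows only.

    A zero standard deviation (constant feature) is replaced by 1.0
    to avoid dividing by zero.
    """
    return [(mean(c), pstdev(c) or 1.0) for c in columns(train_rows)]


def fit_minmax(train_rows):
    """Per-feature (min, range) estimated from training rows only."""
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns(train_rows)]


def transform(rows, params):
    """Apply (x - offset) / scale to every feature of every row."""
    return [
        [(x - offset) / scale for x, (offset, scale) in zip(row, params)]
        for row in rows
    ]


# Hypothetical feature matrix: one row per utterance, one column per feature.
train = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
test = [[2.0, 500.0]]

z_params = fit_zscore(train)
z_train = transform(train, z_params)
z_test = transform(test, z_params)

m_params = fit_minmax(train)
m_train = transform(train, m_params)
m_test = transform(test, m_params)
```

The parameters are fitted on the training rows and then reused for the test rows, so that no statistics from held-out data leak into training. Whether the paper follows this protocol is not stated in this record.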

Files in this item

Name: Sefara_2019.pdf

Size: 1.004Mb

Format: PDF

This item appears in the following Collection(s)

Conference Publications
