dc.contributor.author |
Sefara, Tshephisho J
|
|
dc.contributor.author |
Mokgonyane, TB
|
|
dc.date.accessioned |
2021-10-07T06:40:47Z |
|
dc.date.available |
2021-10-07T06:40:47Z |
|
dc.date.issued |
2021-08 |
|
dc.identifier.citation |
Sefara, T.J. & Mokgonyane, T. 2021. Gender identification in Sepedi speech corpus. http://hdl.handle.net/10204/12120 . |
en_ZA |
dc.identifier.isbn |
978-1-7281-8592-7 |
|
dc.identifier.isbn |
978-1-7281-8591-0 |
|
dc.identifier.isbn |
978-1-7281-8593-4 |
|
dc.identifier.uri |
DOI: 10.1109/icABCD51485.2021.9519308
|
|
dc.identifier.uri |
http://hdl.handle.net/10204/12120
|
|
dc.description.abstract |
Gender identification is the task of identifying the gender of a speaker from an audio signal. Most gender identification systems are developed using datasets from well-resourced languages, and there has been little focus on creating such systems for under-resourced African languages. This paper presents the development of a gender identification system using a Sepedi speech dataset with a duration of 55.7 hours, made up of 30776 males and 28337 females. We build the system using three machine learning models: multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM). Mid-term features are extracted from time domain, frequency domain, and cepstral domain features, and normalised using the Z-score normalisation technique. XGBoost is used as a feature selection method to select important features. On data with seen speakers, MLP achieved the same F-score and accuracy, 94%, while LSTM and CNN achieved the same F-score and accuracy, 97%. We further evaluated the models on data with unseen speakers. All the models achieved good performance in F-score and accuracy. |
en_US |
dc.format |
Fulltext |
en_US |
dc.language.iso |
en |
en_US |
dc.relation.uri |
https://ieeexplore.ieee.org/document/9519308/authors#authors |
en_US |
dc.source |
2021 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems (icABCD), Durban, South Africa, 5-6 August 2021 |
en_US |
dc.subject |
Gender identification |
en_US |
dc.subject |
Convolutional neural network |
en_US |
dc.subject |
Sepedi |
en_US |
dc.subject |
XGBoost |
en_US |
dc.subject |
Feature selection |
en_US |
dc.subject |
Long short-term memory |
en_US |
dc.subject |
Multilayer Perceptron |
en_US |
dc.title |
Gender identification in Sepedi speech corpus |
en_US |
dc.type |
Conference Presentation |
en_US |
dc.description.pages |
6 |
en_US |
dc.description.note |
Paper delivered at the 2021 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems (icABCD), Durban, South Africa, 5-6 August 2021. The attached pdf contains the accepted version of the published item. |
en_US |
dc.description.cluster |
Next Generation Enterprises & Institutions |
en_US |
dc.description.impactarea |
Data Science |
en_US |
dc.identifier.apacitation |
Sefara, T. J., & Mokgonyane, T. (2021). Gender identification in Sepedi speech corpus. http://hdl.handle.net/10204/12120 |
en_ZA |
dc.identifier.chicagocitation |
Sefara, Tshephisho J, and TB Mokgonyane. "Gender identification in Sepedi speech corpus." 2021 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems (icABCD), Durban, South Africa, 5-6 August 2021 (2021): http://hdl.handle.net/10204/12120 |
en_ZA |
dc.identifier.vancouvercitation |
Sefara TJ, Mokgonyane T. Gender identification in Sepedi speech corpus; 2021. http://hdl.handle.net/10204/12120 . |
en_ZA |
dc.identifier.ris |
TY - Conference Presentation
AU - Sefara, Tshephisho J
AU - Mokgonyane, TB
AB - Gender identification is the task of identifying the gender of a speaker from an audio signal. Most gender identification systems are developed using datasets from well-resourced languages, and there has been little focus on creating such systems for under-resourced African languages. This paper presents the development of a gender identification system using a Sepedi speech dataset with a duration of 55.7 hours, made up of 30776 males and 28337 females. We build the system using three machine learning models: multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM). Mid-term features are extracted from time domain, frequency domain, and cepstral domain features, and normalised using the Z-score normalisation technique. XGBoost is used as a feature selection method to select important features. On data with seen speakers, MLP achieved the same F-score and accuracy, 94%, while LSTM and CNN achieved the same F-score and accuracy, 97%. We further evaluated the models on data with unseen speakers. All the models achieved good performance in F-score and accuracy.
DA - 2021-08
DB - ResearchSpace
DP - CSIR
J1 - 2021 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems (icABCD), Durban, South Africa, 5-6 August 2021
KW - Gender identification
KW - Convolutional neural network
KW - Sepedi
KW - XGBoost
KW - Feature selection
KW - Long short-term memory
KW - Multilayer Perceptron
LK - https://researchspace.csir.co.za
PY - 2021
SM - 978-1-7281-8592-7
SM - 978-1-7281-8591-0
SM - 978-1-7281-8593-4
T1 - Gender identification in Sepedi speech corpus
TI - Gender identification in Sepedi speech corpus
UR - http://hdl.handle.net/10204/12120
ER -
|
en_ZA |
dc.identifier.worklist |
24961 |
en_US |