dc.contributor.author |
Modipa, T
|
|
dc.contributor.author |
Davel, MH
|
|
dc.date.accessioned |
2010-12-23T10:00:21Z |
|
dc.date.available |
2010-12-23T10:00:21Z |
|
dc.date.issued |
2010-11 |
|
dc.identifier.citation |
Modipa, T and Davel, MH. 2010. Pronunciation modelling of foreign words for Sepedi ASR. 21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010, pp 185-189 |
en |
dc.identifier.isbn |
978-0-7992-2470-2 |
|
dc.identifier.uri |
http://hdl.handle.net/10204/4715
|
|
dc.description |
21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010 |
en |
dc.description.abstract |
This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (by a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus. |
en |
dc.language.iso |
en |
en |
dc.publisher |
PRASA 2010 |
en |
dc.relation.ispartofseries |
Conference Paper |
en |
dc.subject |
Sepedi |
en |
dc.subject |
Automatic speech recognition |
en |
dc.subject |
Pronunciation modelling |
en |
dc.subject |
Pattern recognition |
en |
dc.subject |
PRASA 2010 |
en |
dc.title |
Pronunciation modelling of foreign words for Sepedi ASR |
en |
dc.type |
Conference Presentation |
en |
dc.identifier.apacitation |
Modipa, T., & Davel, M. H. (2010). Pronunciation modelling of foreign words for Sepedi ASR. PRASA 2010. http://hdl.handle.net/10204/4715 |
en_ZA |
dc.identifier.chicagocitation |
Modipa, T, and MH Davel. "Pronunciation modelling of foreign words for Sepedi ASR." (2010): http://hdl.handle.net/10204/4715 |
en_ZA |
dc.identifier.vancouvercitation |
Modipa T, Davel MH. Pronunciation modelling of foreign words for Sepedi ASR. PRASA 2010; 2010. http://hdl.handle.net/10204/4715 |
en_ZA |
dc.identifier.ris |
TY - Conference Presentation
AU - Modipa, T
AU - Davel, MH
AB - This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (by a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus.
DA - 2010-11
DB - ResearchSpace
DP - CSIR
KW - Sepedi
KW - Automatic speech recognition
KW - Pronunciation modelling
KW - Pattern recognition
KW - PRASA 2010
LK - https://researchspace.csir.co.za
PY - 2010
SM - 978-0-7992-2470-2
T1 - Pronunciation modelling of foreign words for Sepedi ASR
TI - Pronunciation modelling of foreign words for Sepedi ASR
UR - http://hdl.handle.net/10204/4715
ER -
|
en_ZA |