The NCHLT speech corpus of the South African languages

Barnard, E; Davel, MH; Van Heerden, C; De Wet, Febe; Badenhorst, J

dc.contributor.author	Barnard, E
dc.contributor.author	Davel, MH
dc.contributor.author	Van Heerden, C
dc.contributor.author	De Wet, Febe
dc.contributor.author	Badenhorst, J
dc.date.accessioned	2014-07-30T09:25:09Z
dc.date.available	2014-07-30T09:25:09Z
dc.date.issued	2014-05
dc.identifier.citation	Barnard, E, Davel, M.H, Van Heerden, C, De Wet, F and Badenhorst, J. 2014. The NCHLT speech corpus of the South African languages. In: 4th International Workshop on Spoken Language Technologies for Under-Resourced Languages, St Petersburg, Russia, 14-16 May 2014	en_US
dc.identifier.isbn	978-5-8088-0908-6
dc.identifier.uri	http://mica.edu.vn/sltu2014/proceedings/28.pdf
dc.identifier.uri	http://hdl.handle.net/10204/7549
dc.description	4th International Workshop on Spoken Language Technologies for Under-Resourced Languages, St Petersburg, Russia, 14-16 May 2014	en_US
dc.description.abstract	The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were undertaken in order to develop the corpus, and report on associated materials such as orthographic transcriptions and pronunciation dictionaries that were released as part of the corpus. In order to benchmark speech recognition performance on the corpus, we have also developed both phone-recognition and word-recognition systems for all eleven languages; we find that high accuracies can be achieved for these speaker-independent but vocabulary-dependent recognition tasks in all languages.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	Workflow;13145
dc.subject	Automatic Speech Recognition	en_US
dc.subject	ASR	en_US
dc.subject	Text-to-speech	en_US
dc.subject	TTS	en_US
dc.subject	South African languages	en_US
dc.subject	Spoken language technologies	en_US
dc.subject	Under-resourced languages	en_US
dc.title	The NCHLT speech corpus of the South African languages	en_US
dc.type	Conference Presentation	en_US
dc.identifier.apacitation	Barnard, E., Davel, M., Van Heerden, C., De Wet, F., & Badenhorst, J. (2014). The NCHLT speech corpus of the South African languages. http://hdl.handle.net/10204/7549	en_ZA
dc.identifier.chicagocitation	Barnard, E, MH Davel, C Van Heerden, Febe De Wet, and J Badenhorst. "The NCHLT speech corpus of the South African languages." (2014): http://hdl.handle.net/10204/7549	en_ZA
dc.identifier.vancouvercitation	Barnard E, Davel M, Van Heerden C, De Wet F, Badenhorst J, The NCHLT speech corpus of the South African languages; 2014. http://hdl.handle.net/10204/7549 .	en_ZA
dc.identifier.ris	TY - Conference Presentation AU - Barnard, E AU - Davel, MH AU - Van Heerden, C AU - De Wet, Febe AU - Badenhorst, J AB - The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were undertaken in order to develop the corpus, and report on associated materials such as orthographic transcriptions and pronunciation dictionaries that were released as part of the corpus. In order to benchmark speech recognition performance on the corpus, we have also developed both phone-recognition and word-recognition systems for all eleven languages; we find that high accuracies can be achieved for these speaker-independent but vocabulary-dependent recognition tasks in all languages. DA - 2014-05 DB - ResearchSpace DP - CSIR KW - Automatic Speech Recognition KW - ASR KW - Text-to-speech KW - TTS KW - South African languages KW - Spoken language technologies KW - Under-resourced languages LK - https://researchspace.csir.co.za PY - 2014 SM - 978-5-8088-0908-6 T1 - The NCHLT speech corpus of the South African languages TI - The NCHLT speech corpus of the South African languages UR - http://hdl.handle.net/10204/7549 ER -	en_ZA

Files in this item

Name: De Wet_2014_ABSTRACT ...

Size: 6.684Kb

Format: PDF

This item appears in the following Collection(s)

Conference Publications

