Diphone-based speech synthesis has been researched for many years, and some of the most successful concatenative synthesis systems employ diphones. Such systems can produce very intelligible synthetic speech, but tend not to sound completely natural. This lack of naturalness can be attributed, at least in part, to the limited set of units from which speech is chosen, coupled with the need to prosodically modify the speech signal of each diphone. Diphone backoff mechanisms in text-to-speech ensure that the text can still be synthesised even if some of the diphones it requires are missing from the speech database. This paper describes an automatic method for synthetically creating missing diphones from halfphones that are in the speech database.
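The backoff idea can be illustrated with a minimal sketch. It is not the authors' implementation: the inventory layout, the `"L"`/`"R"` halfphone labels and the function name `get_unit` are hypothetical, and real systems would also smooth the join and adjust prosody. It relies only on the fact that a diphone runs from the middle of one phone to the middle of the next, so a missing diphone can be approximated by the right half of the first phone followed by the left half of the second.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy representation: a unit is a list of waveform samples.
Unit = List[float]


def get_unit(
    left: str,
    right: str,
    diphones: Dict[str, Unit],
    halfphones: Dict[Tuple[str, str], Unit],
) -> Optional[Unit]:
    """Return the diphone left-right, backing off to halfphones if needed."""
    name = f"{left}-{right}"
    if name in diphones:
        # The recorded diphone exists, so use it directly.
        return diphones[name]

    # Backoff: right half of the first phone + left half of the second phone.
    first_half = halfphones.get((left, "R"))
    second_half = halfphones.get((right, "L"))
    if first_half is None or second_half is None:
        # Neither the diphone nor both halfphones are available.
        return None

    # Plain concatenation; an assumption made to keep the sketch short.
    return first_half + second_half
```

With this layout, a request for a diphone that was never recorded still yields a unit as long as both halfphones exist, which is the property the backoff mechanism is meant to guarantee.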
Reference:
Louw, JA and Davel, M. 2006. Halfphones: a backoff mechanism for Diphone Unit Selection Synthesis. 17th Annual Symposium of the Pattern Recognition Association of South Africa, Parys, South Africa, 29 Nov - 1 Dec 2006, pp 4. http://hdl.handle.net/10204/1041