Mismatches between application and training data greatly reduce the performance of automatic speech recognition (ASR) systems. However, collecting sufficient in-domain, application-specific training data is resource intensive and may not be feasible in resource-scarce environments. Combining limited amounts of in-domain data with feature normalisation and acoustic model adaptation techniques has therefore found wide use in ASR systems. Various approaches have been proposed, but it is not clear which approach to use for a given amount of adaptation data. In this work we investigate standard feature normalisation and model adaptation techniques for the scenario where adaptation between narrow- and wide-band environments must be performed. Our investigation focuses on how the performance of the various adaptation techniques depends on the amount of adaptation data: we systematically vary the amount of adaptation data and compare the performance of the techniques. From this we establish a guideline that an ASR developer can use to choose the best adaptation technique given a size constraint on the adaptation data. In addition, we investigate the effectiveness of a novel channel normalisation technique and compare its performance with that of standard normalisation and adaptation techniques.
Reference:
Kleynhans, N. and Barnard, E. 2013. Cross-bandwidth adaptation for ASR systems. In: Conference Proceedings of the 24th Annual Symposium of the Pattern Recognition Association of South Africa, Johannesburg, South Africa, 3 December 2013. http://hdl.handle.net/10204/7271