MXit lingo is an abbreviated form of written English used by children, teenagers and young adults when communicating over cell phones using MXit. A stemmer for MXit lingo would enable a search engine such as Lucene to index stored MXit conversations for later searching. Such a stemmer would have to cater for the new grammatical and linguistic conventions that have developed in MXit lingo. For example, a trailing -er in a word may be changed to -a, so that “ova” can be used in place of “over” and “unda” in place of “under”. This paper describes the creation of a Lucene stemmer for MXit lingo and itemizes the conventions that have been noted in the lingo.
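The -a for -er convention mentioned above can be illustrated with a minimal Java sketch. This is not the paper's implementation: the class name `MxitErRule`, the `normalize` method and the vocabulary check are all assumptions introduced here for illustration. The vocabulary guard is one possible way to avoid rewriting ordinary words that happen to end in -a; in a Lucene setting, logic of this kind would typically be wrapped in a token filter.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not from the paper): maps a trailing "-a" back to "-er". */
final class MxitErRule {
    private MxitErRule() {}

    /**
     * Lower-cases the token, then replaces a trailing "a" with "er" only if the
     * result is found in the supplied vocabulary of standard English words.
     * The vocabulary guard is an assumption of this sketch.
     */
    static String normalize(String token, Set<String> vocabulary) {
        String lower = token.toLowerCase(Locale.ROOT);
        if (lower.length() > 2 && lower.endsWith("a")) {
            String candidate = lower.substring(0, lower.length() - 1) + "er";
            if (vocabulary.contains(candidate)) {
                return candidate;
            }
        }
        // No matching standard form: leave the (lower-cased) token as it is.
        return lower;
    }
}
```

With a vocabulary containing "over" and "under", `MxitErRule.normalize("ova", vocabulary)` yields "over" and `MxitErRule.normalize("unda", vocabulary)` yields "under", while a word such as "pizza" is left unchanged because "pizzer" is not in the vocabulary.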
Reference:
Butgereit, LL and Botha, RA. 2011. Lucene stemmer for MXit lingo. The 13th Annual Conference on World Wide Web Applications, Johannesburg, 14-16 September 2011. Cape Peninsula University of Technology. http://hdl.handle.net/10204/5329