Cross/Multi-Lingual and Code-Switched Speech Recognition

Mon-3-1-5 Multilingual Acoustic and Language Modeling for Ethio-Semitic Languages

Solomon Teferra Abate (Addis Ababa University), Martha Yifiru Tachbelie (Addis Ababa University) and Tanja Schultz (Universität Bremen)
Abstract: Developing multilingual Automatic Speech Recognition (ASR) systems makes it possible to share existing speech and text corpora among languages. We have conducted experiments on the development of multilingual Acoustic Models (AM) and Language Models (LM) for Tigrigna. Using an Amharic Deep Neural Network (DNN) AM together with a Tigrigna pronunciation dictionary and trigram LM, we achieved a Word Error Rate (WER) of 30.9% for Tigrigna. Adding training speech from the target language (Tigrigna) to the full training speech of the donor language (Amharic) steadily reduces WER as the amount of added data grows. We have also developed several multilingual LMs, including ones based on recurrent neural networks, and achieved a relative WER reduction of 3.56% compared to monolingual trigram LMs. Because computational resources for decoding with very large vocabularies are scarce, we have also experimented with morphemes as pronunciation and language modeling units. We achieved a character error rate (CER) of 7.9%, which is 38.3% to 1.3% lower in relative terms than the CER of word-based models with vocabularies smaller than 162k. Our results show that it is possible to develop an ASR system for an Ethio-Semitic language using existing speech and text corpora of another language in the family.
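The abstract reports results as WER, CER and relative reductions. As a minimal sketch of how such figures are conventionally computed (this is not the authors' scoring tool, and the function names `edit_distance`, `wer`, `cer` and `relative_reduction` are hypothetical), the Python below uses Levenshtein distance over words or characters. It assumes a non-empty reference and, for CER, counts spaces as characters; real scoring pipelines may normalize text differently.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens seen so far
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edits divided by reference length."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref: str, hyp: str) -> float:
    """Character error rate (assumption: spaces count as characters)."""
    return edit_distance(list(ref), list(hyp)) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline
```

With these definitions, a system whose error rate falls from a baseline of b to a new value n has a relative reduction of (b - n) / b, which is how a figure such as "a relative WER reduction of 3.56%" is usually read; the baseline values themselves are not given in the abstract.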