Tue-1-8-6 Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition

Abhinav Garg (Samsung Electronics), Ashutosh Gupta (Samsung Electronics), Dhananjaya Gowda (Samsung Research), Shatrughan Singh (Samsung Research) and Chanwoo Kim (Samsung Research)
Abstract: In this paper, we propose a hierarchical multi-stage word-to-grapheme Named Entity Correction (NEC) algorithm. Conventional NEC algorithms use a single-stage grapheme- or phoneme-level edit distance to search for and replace Named Entities (NEs) misrecognized by a speech recognizer. However, longer named entities such as song titles cannot be easily handled by such single-stage correction. We propose a three-stage NEC, starting with word-level matching, followed by phonetic Double Metaphone based matching, and a final grapheme-level candidate selection. We also propose a novel NE rejection mechanism, which is important to ensure that the NEC does not replace correctly recognized NEs with unintended but similar named entities. We evaluate our solution on two test sets, from the call and music domains, for both server and on-device speech recognition configurations. For the on-device model, our NEC outperforms n-gram fusion when employed standalone. When used after an n-gram based biasing language model, our NEC reduces the word error rate by 14% relative for music and 63% relative for call. The average latency of our NEC is under 3 ms per input sentence while using only around 1 MB for an input NE list of 20,000 entries.
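The three-stage flow and the rejection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the fallback behaviour when a stage finds no candidates, and the `max_norm_dist` threshold are all assumptions made for illustration, and `phonetic_key` is a crude stand-in for Double Metaphone rather than the real algorithm.

```python
def phonetic_key(word):
    """Crude phonetic key (NOT Double Metaphone): keep the first letter,
    drop later vowels and h/w/y, and collapse repeated consonants."""
    word = "".join(c for c in word.lower() if c.isalpha())
    if not word:
        return ""
    key = word[0]
    for c in word[1:]:
        if c in "aeiouyhw":
            continue
        if c != key[-1]:
            key += c
    return key


def edit_distance(a, b):
    """Grapheme-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correct_entity(hyp, ne_list, max_norm_dist=0.3):
    """Return a replacement from ne_list, or hyp unchanged if rejected.
    max_norm_dist is a hypothetical rejection threshold."""
    hyp_words = hyp.lower().split()

    # Stage 1 (word level): keep entries sharing at least one word.
    stage1 = [ne for ne in ne_list
              if set(ne.lower().split()) & set(hyp_words)]
    if not stage1:
        stage1 = list(ne_list)  # assumed fallback: keep all candidates

    # Stage 2 (phonetic): keep entries whose per-word keys match.
    hyp_keys = [phonetic_key(w) for w in hyp_words]
    stage2 = [ne for ne in stage1
              if [phonetic_key(w) for w in ne.split()] == hyp_keys]
    if not stage2:
        stage2 = stage1  # assumed fallback: keep stage-1 candidates

    # Stage 3 (grapheme level): pick the closest remaining candidate.
    best = min(stage2, key=lambda ne: edit_distance(hyp.lower(), ne.lower()))
    dist = edit_distance(hyp.lower(), best.lower())

    # Rejection: leave the hypothesis alone if the best match is too far.
    if dist / max(len(hyp), len(best)) > max_norm_dist:
        return hyp
    return best


contacts = ["John Smith", "Jon Snow", "Joan Smythe"]
print(correct_entity("john smyth", contacts))
print(correct_entity("play some music", contacts))
```

In this sketch, narrowing the candidate set with cheap word-level and phonetic filters before computing edit distances is what keeps the final grapheme-level comparison small; the rejection check is what prevents an unrelated phrase from being overwritten by the nearest list entry.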