Tue-1-3-2 ASR Error Correction with Augmented Transformer for Entity Retrieval

Haoyu Wang(Amazon), Shuyan Dong(Amazon), Yue Liu(Amazon), James Logan(Amazon), Ashish Kumar Agrawal(Amazon) and Yang Liu(Amazon)
Abstract: Domain-agnostic Automatic Speech Recognition (ASR) systems suffer from the issue of mistranscribing domain-specific words, which leads to failures in downstream tasks. In this paper, we present a post-editing ASR error correction method using the Transformer model for entity mention correction and retrieval. Specifically, we propose a novel augmented variant of the Transformer model that encodes both the word and phoneme sequence of an entity, and attends to phoneme information in addition to word-level information during decoding to correct mistranscribed named entities. We evaluate our method on both the ASR error correction task and the downstream retrieval task. Our method achieves 48.08% entity error rate (EER) reduction in ASR error correction task and 26.74% mean reciprocal rank (MRR) improvement for the retrieval task. In addition, our augmented Transformer model significantly outperforms the vanilla Transformer model with 17.89% EER reduction and 1.98% MRR increase, demonstrating the effectiveness of incorporating phoneme information in the correction model.
Student Information

Student Events

Travel Grants