Wed-3-9-4 Iterative Compression of End-to-End ASR Model using AutoML

Abhinav Mehrotra (Samsung AI Center), Lukasz Dudziak (Samsung AI Center), Jinsu Yeo (Samsung), Young-yoon Lee (Samsung), Ravichander Vipperla (Samsung AI Center), Mohamed Abdelfattah (Samsung AI Center), Sourav Bhattacharya (Samsung AI Center), Samin Ishtiaq (Samsung AI Center), Alberto Gil C. P. Ramos (Samsung AI Center), SangJeong Lee (Samsung), Daehyun Kim (Samsung) and Nic Lane (Samsung AI Center)
Abstract: Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low-Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
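The core operation behind LRF can be illustrated with a minimal sketch: a weight matrix is replaced by two smaller factors obtained via truncated SVD, so a layer's parameter count drops from m*n to r*(m+n) for a chosen rank r. This is an assumption-laden illustration of the basic factorization step only, not the authors' AutoML rank-search or their iterative compression procedure; all names below are hypothetical.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as A @ B using truncated SVD.

    A has shape (m, rank), B has shape (rank, n); in a network, one
    dense layer with weights W becomes two smaller layers A and B.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # fold singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Example: a 256 x 256 layer factorized at rank 32 shrinks its
# parameter count by 256*256 / (32*(256+256)) = 4x.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
A, B = low_rank_factorize(W, rank=32)
compression = W.size / (A.size + B.size)
```

The AutoML component described in the abstract would, under this framing, search for a per-layer rank that trades this compression ratio against the resulting WER.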