Tue-1-8-5 Evolved Speech Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition

Jihwan Kim(VUNO), Jisung Wang(VUNO), Sangki Kim(VUNO) and Yeha Lee(VUNO)
Abstract: Neural architecture search (NAS) has been successfully applied to finding efficient, high-performance deep neural network architectures in a task-adaptive manner without extensive human intervention. This is achieved by turning to genetic, reinforcement learning, or gradient-based algorithms as automated alternatives to manual architecture design. However, a naive application of existing NAS algorithms to a different task may find architectures which perform sub-par to manually designed ones. In this work, we show that NAS can find efficient architectures which also outperform manually designed architectures on speech recognition tasks; we name the resulting model the Evolved Speech Transformer (EST). With a combination of a carefully designed search space and progressive dynamic hurdles, a genetic algorithm based on survival of the fittest, our algorithm finds a memory-efficient architecture which outperforms the Transformer with reduced training time.
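The abstract's search strategy, evolutionary NAS with progressive dynamic hurdles, can be illustrated with a minimal sketch. This is not the authors' implementation: the genotype encoding, the toy scoring function, and all parameter names below are hypothetical stand-ins chosen to show the control flow, where candidates must clear a fitness hurdle at a small training budget before receiving more compute, and survivors repopulate by mutation.

```python
import random

random.seed(0)

# Hypothetical genotype: an architecture is a fixed-depth list of layer choices.
CHOICES = ["conv", "self_attn", "ffn", "identity"]

def random_arch(depth=6):
    return [random.choice(CHOICES) for _ in range(depth)]

def evaluate(arch, budget):
    # Stand-in for partially training the candidate for `budget` steps.
    # A real NAS run would return validation accuracy here.
    base = 2 * arch.count("self_attn") + arch.count("ffn")
    return base + 0.1 * budget

def mutate(arch):
    # Point mutation: replace one randomly chosen layer.
    child = arch[:]
    child[random.randrange(len(child))] = random.choice(CHOICES)
    return child

def evolve_with_hurdles(pop_size=16, generations=3, budgets=(1, 5, 10)):
    population = [random_arch() for _ in range(pop_size)]
    for _ in range(generations):
        # Progressive dynamic hurdles: evaluate at a small budget first,
        # and grant more training budget only to the top half that clears
        # each hurdle, so weak candidates are discarded cheaply.
        survivors = population
        for budget in budgets:
            scored = sorted(survivors, key=lambda a: evaluate(a, budget),
                            reverse=True)
            survivors = scored[: max(2, len(scored) // 2)]
        # Survival of the fittest: survivors seed the next generation.
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=lambda a: evaluate(a, budgets[-1]))

best = evolve_with_hurdles()
print(best)
```

The key saving over plain evolutionary search is that most candidates are only ever trained at the cheapest budget; full training is reserved for the few that pass every hurdle.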