Yuqin Lin (Tianjin University), Longbiao Wang (Tianjin University), Sheng Li (National Institute of Information and Communications Technology (NICT), Advanced Speech Technology Laboratory), Jianwu Dang (JAIST) and Chenchen Ding (NICT)
This study proposes a staged knowledge distillation method to build End-to-End (E2E) automatic speech recognition (ASR) and automatic speech attribute transcription (ASAT) systems for patients with dysarthria caused by either cerebral palsy (CP) or amyotrophic lateral sclerosis (ALS). Compared with traditional methods, the proposed method makes more effective use of limited dysarthric speech data. On the TORGO dataset, the dysarthric E2E-ASR and ASAT systems enhanced by the proposed method achieve a 38.28% relative phone error rate (PER%) reduction and a 48.33% relative attribute detection error rate (DER%) reduction over their respective baselines. The experiments show that our system offers potential as a rehabilitation tool and medical diagnostic aid.