Thu-3-6-3 Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription

Yuqin Lin (Tianjin University), Longbiao Wang (Tianjin University), Sheng Li (National Institute of Information and Communications Technology (NICT), Advanced Speech Technology Laboratory), Jianwu Dang (JAIST), and Chenchen Ding (NICT)

Abstract: This study proposes a staged knowledge distillation method for building end-to-end (E2E) automatic speech recognition (ASR) and automatic speech attribute transcription (ASAT) systems for patients with dysarthria caused by either cerebral palsy (CP) or amyotrophic lateral sclerosis (ALS). Compared with traditional methods, the proposed method uses limited dysarthric speech data more effectively. On the TORGO dataset, the dysarthric E2E-ASR and ASAT systems enhanced by the proposed method achieve a 38.28% relative reduction in phone error rate (PER) and a 48.33% relative reduction in attribute detection error rate (DER), respectively, over their baselines. The experiments suggest that the system has potential as a rehabilitation tool and medical diagnostic aid.
