Abner Hernandez (Seoul National University), Eun Jung Yeo (Seoul National University), Sunhee Kim (Seoul National University), and Minhwa Chung (Seoul National University)
Abstract:
Dysarthria refers to a range of speech disorders that mainly affect articulation. However, impairments are also seen in suprasegmental elements of speech such as prosody. In this study, we examine the effect of rhythm metrics on detecting dysarthria and on assessing its severity level. Previous studies investigating prosodic irregularities in dysarthria tend to focus on pitch or voice quality measurements. Rhythm is another aspect of prosody, referring to the division of speech units into intervals of relatively equal duration. Speakers with dysarthria tend to have irregular rhythmic patterns, which could be useful for detecting dysarthria. We compare the classification accuracy obtained with standard prosodic features alone against that obtained with standard prosodic features combined with rhythm-based features, using random forest, support vector machine, and feed-forward neural network classifiers. Our best-performing classifiers achieved relative increases of 7.5% and 15% in detection and severity assessment, respectively, on the QoLT Korean dataset, while the TORGO English dataset showed increases of 4.1% and 3.2%. The results indicate that including rhythmic information can improve classification accuracy regardless of the classifier. Furthermore, we show that rhythm metrics are useful in both Korean and English.
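
As a rough illustration of the two quantities discussed above, the following Python sketch computes one rhythm metric and a relative percentage increase. The abstract does not say which rhythm metrics were used, so the normalized Pairwise Variability Index (nPVI) is chosen here only as an assumption, being one commonly used rhythm measure. The function names `npvi` and `relative_increase`, the interval durations, and the accuracy values are all hypothetical and are not taken from the study.

```python
from statistics import mean


def npvi(durations):
    """Normalized Pairwise Variability Index over successive interval durations.

    Assumption: nPVI stands in for "rhythm metrics" here; the abstract does
    not name the metrics actually used. Durations must be positive.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    # Compare each interval with the next, normalizing by the pair's mean.
    pairs = zip(durations, durations[1:])
    return 100 * mean(abs(a - b) / ((a + b) / 2) for a, b in pairs)


def relative_increase(baseline, improved):
    """Relative percentage increase of `improved` over `baseline`."""
    return 100 * (improved - baseline) / baseline


# Hypothetical vowel durations in seconds (not data from the paper).
regular = [0.10, 0.10, 0.10, 0.10]
irregular = [0.05, 0.20, 0.08, 0.25]
print(npvi(regular), npvi(irregular))

# Hypothetical accuracies: baseline prosodic features vs. with rhythm added.
print(relative_increase(80.0, 86.0))
```

Perfectly even intervals give an nPVI of zero, and more variable timing gives larger values, which is why a measure of this kind could help separate irregular rhythmic patterns from typical ones. A relative increase is taken with respect to the baseline accuracy, so it differs from the absolute difference in percentage points.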