The INTERSPEECH 2020 Computational Paralinguistics ChallengE (ComParE)

Wed-SS-1-4-5 Phonetic, Frame Clustering and Intelligibility Analyses for the INTERSPEECH 2020 ComParE Challenge

Claude Montacié (Sorbonne University (STIH)) and Marie-José Caraty (Paris University (STIH))
Abstract: The INTERSPEECH 2020 ComParE Mask Sub-Challenge is to determine whether a speech signal was produced with or without a surgical mask. For this purpose, we investigated phonetic context and intelligibility measurements related to the speech changes caused by wearing a mask. Experiments were conducted on the Mask Augsburg Speech Corpus (MASC) and on the Mask Sorbonne Speech Corpus (MSSC), both in German. We investigated the effects of mask wearing on the acoustic properties of phonemes at the frame and segment levels. At the frame level, a phonetic mask detector was developed to determine which phonemes are most sensitive to mask wearing. At the segment level, a perceptual intelligibility score was developed and assessed on the MSSC. Two mask detection systems were developed and assessed on the MASC: the first used two large composite audio feature sets; the second used a bottom-up approach based on phonetic analysis and frame clustering. Experiments showed an improvement of 5.9% (absolute) on the test set over the official Challenge baseline (71.8%).