Mon-3-9-2 iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning

Haoyu Li (National Institute of Informatics), Szu-wei Fu (Research Center for Information Technology Innovation, Academia Sinica), Yu Tsao (Academia Sinica) and Junichi Yamagishi (National Institute of Informatics)

Abstract: The intelligibility of natural speech is seriously degraded when exposed to adverse noisy environments. In this work, we propose a deep learning-based speech modification method to compensate for the intelligibility loss, with the constraint that the root mean square (RMS) level and duration of the speech signal are maintained before and after modifications. Specifically, we utilize an iMetricGAN approach to optimize the speech intelligibility metrics with generative adversarial networks (GANs). Experimental results show that the proposed iMetricGAN outperforms conventional state-of-the-art algorithms in terms of objective measures, i.e., speech intelligibility in bits (SIIB) and extended short-time objective intelligibility (ESTOI), under a Cafeteria noise condition. In addition, formal listening tests reveal significant intelligibility gains when both noise and reverberation exist.
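
The equal-RMS, equal-duration constraint mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `rms` and `match_rms` and the sample values are hypothetical, and the sketch only shows how a modified signal might be rescaled so that its RMS level matches the original while its length is left unchanged.

```python
import math


def rms(signal):
    """Root mean square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def match_rms(modified, reference):
    """Rescale `modified` so that its RMS level equals that of `reference`.

    Hypothetical helper for illustration only; it also checks that the
    number of samples (the duration) is the same for both signals.
    """
    if len(modified) != len(reference):
        raise ValueError("duration must be unchanged")
    level = rms(modified)
    if level == 0.0:
        # A silent signal cannot be rescaled; return it as-is.
        return list(modified)
    gain = rms(reference) / level
    return [gain * x for x in modified]


# Made-up sample values, not data from the paper.
original = [0.1, -0.2, 0.3, -0.4]
enhanced = [0.5, -0.1, 0.9, -0.3]  # stands in for a modified signal

constrained = match_rms(enhanced, original)
```

Under this constraint, any intelligibility gain has to come from redistributing energy across time and frequency rather than from simply playing the speech louder or slower.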

