Tue-1-2-9 Blind speech signal quality estimation for speaker verification systems

Galina Lavrentyeva (ITMO University, STC-innovations), Marina Volkova (ITMO University, STC-innovations Ltd.), Anastasia Avdeeva (STC-innovations Ltd.), Sergey Novoselov (ITMO University, Speech Technology Center), Artem Gorlanov (STC-innovations Ltd.), Tseren Andzukaev (STC-innovations Ltd.), Artem Ivanov (STC-innovations Ltd.) and Alexandr Kozlov (Speech Technology Center Ltd.)

Abstract: Performance degradation under mismatched acoustic conditions is a widely acknowledged problem that is common to many fields, and current state-of-the-art deep speaker embedding models are domain-sensitive. The main idea of this research is to develop a single method for automatic signal quality estimation that makes it possible to evaluate short-term signal characteristics. This paper presents a neural-network-based approach for blind speech signal quality estimation in terms of signal-to-noise ratio (SNR) and reverberation time (RT60), which can also classify the type of underlying additive noise. The research additionally revealed the need for an accurate voice activity detector (VAD) that performs well in both clean and noisy unseen environments; a novel neural network VAD based on the U-Net architecture is therefore presented. The proposed algorithms make it possible to analyze the NIST, SITW, and VOiCES datasets, which are commonly used for objective comparison of speaker verification systems, from a new point of view, and to consider effective calibration steps that improve speaker recognition quality on them.
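
For readers unfamiliar with the SNR quantity named in the abstract, the following is a minimal sketch of how a reference SNR in decibels can be computed when the clean speech and the additive noise are both known, and how noise can be scaled to reach a target SNR. This is not the blind neural estimator described in the paper, which works without access to the clean signal. The function names `power`, `snr_db`, and `mix_at_snr` and the synthetic tone used as a stand-in for speech are assumptions made for illustration only.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(speech, noise):
    """Reference SNR in dB: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(power(speech) / power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise so that speech + scaled noise has the target SNR."""
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(power(speech) / (power(noise) * ratio))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


# Hypothetical data: a 220 Hz tone standing in for speech, Gaussian noise.
random.seed(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 220 * i / sample_rate) for i in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(speech, noise, 10.0)
measured = snr_db(speech, scaled_noise)
print(f"measured SNR: {measured:.2f} dB")
```

Mixtures built this way are one plausible source of labeled examples for a blind estimator, since the target SNR is known by construction; whether the authors prepared their training data like this is not stated in the abstract.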

