Mon-3-10-3 Speaker conditioned acoustic-to-articulatory inversion using x-vectors

Aravind Illa (PhD Student, Indian Institute of Science, Bangalore) and Prasanta Ghosh (Assistant Professor, EE, IISc)

Abstract: Speech production involves the movement of various articulators, including the tongue, jaw, and lips. Estimating the movement of the articulators from the acoustics of speech is known as acoustic-to-articulatory inversion (AAI). Recently, it has been shown that, instead of training AAI in a speaker-specific manner, pooling the acoustic-articulatory data from multiple speakers is beneficial. Further, conditioning AAI on speaker-specific information, supplied as a one-hot encoding at the input alongside the acoustic features, improves AAI performance in a closed-set condition where the same speakers appear in training and test. In this work, we carry out an experimental study on the benefit of using x-vectors to provide the speaker-specific information that conditions AAI. Experiments with 30 speakers show that AAI performance benefits from the use of x-vectors in a closed-set, seen-speaker condition. Further, x-vectors also generalize well in unseen-speaker evaluation.
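The input-level conditioning described in the abstract can be sketched as appending one fixed speaker vector to every acoustic frame before it reaches the AAI model. The following is a minimal illustration only, not the authors' implementation: the helper names `condition_on_speaker` and `one_hot`, the 13-dimensional acoustic frames, and the 512-dimensional x-vector are all assumptions, and random numbers stand in for real features and for a real x-vector extractor. Only the speaker count of 30 comes from the abstract.

```python
import numpy as np


def condition_on_speaker(acoustic_frames, speaker_vector):
    """Append the same speaker vector to every acoustic frame.

    acoustic_frames: array of shape (num_frames, feat_dim)
    speaker_vector:  array of shape (spk_dim,)
    returns:         array of shape (num_frames, feat_dim + spk_dim)
    """
    acoustic_frames = np.asarray(acoustic_frames, dtype=np.float32)
    speaker_vector = np.asarray(speaker_vector, dtype=np.float32)
    num_frames = acoustic_frames.shape[0]
    # Repeat the utterance-level speaker vector once per frame.
    tiled = np.tile(speaker_vector, (num_frames, 1))
    return np.concatenate([acoustic_frames, tiled], axis=1)


def one_hot(speaker_index, num_speakers):
    """One-hot speaker code; only defined for speakers seen in training."""
    vec = np.zeros(num_speakers, dtype=np.float32)
    vec[speaker_index] = 1.0
    return vec


# Assumed sizes: 100 frames of 13-dim acoustic features, 512-dim x-vector.
rng = np.random.default_rng(0)
frames = rng.standard_normal((100, 13)).astype(np.float32)

# Closed-set conditioning: one-hot code over the 30 training speakers.
one_hot_input = condition_on_speaker(frames, one_hot(4, 30))

# x-vector conditioning: a random placeholder stands in for an embedding
# that a pretrained speaker-embedding extractor would produce.
xvec = rng.standard_normal(512).astype(np.float32)
xvector_input = condition_on_speaker(frames, xvec)
```

A one-hot code has no entry for a speaker outside the training set, whereas an x-vector can be computed for any speaker's audio, which is why the latter can be evaluated in the unseen-speaker condition.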
