Wed-SS-2-3-3 The DKU Speech Activity Detection and Speaker Identification Systems for Fearless Steps Challenge Phase-02

Qingjian Lin (SEIT, Sun Yat-sen University), Tingle Li (Duke Kunshan University) and Ming Li (Duke Kunshan University)

Abstract: This paper describes the systems developed by the DKU team for the Fearless Steps Challenge Phase-02 competition. For the Speech Activity Detection task, we start with a Long Short-Term Memory (LSTM) system and then apply a ResNet-LSTM improvement. Our ResNet-LSTM system reduces the DCF by about 38% relative to the LSTM baseline. We also discuss system performance when additional training corpora are included; the lowest DCF on the Eval Set, 1.406%, is obtained with system pre-training. For the Speaker Identification task, we employ the Deep ResNet vector system, which receives a variable-length feature sequence and directly generates speaker posteriors. Pre-training with VoxCeleb is also considered, and our best-performing system achieves a Top-5 accuracy of 92.393% on the Eval Set.
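To make the Top-5 accuracy metric concrete, here is a minimal sketch of how such a score could be computed from per-utterance speaker posteriors. It is an illustration only, not the challenge's scoring tool or the authors' code: the function name `top_k_accuracy`, the list-of-lists input layout, and the toy data are all assumptions.

```python
from typing import Sequence


def top_k_accuracy(
    posteriors: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of utterances whose true speaker is among the k highest posteriors.

    posteriors[i][j] is the (hypothetical) posterior of speaker j for utterance i;
    labels[i] is the index of the true speaker for utterance i.
    """
    if len(posteriors) != len(labels):
        raise ValueError("posteriors and labels must have the same length")
    if not labels:
        raise ValueError("at least one utterance is required")

    hits = 0
    for scores, true_id in zip(posteriors, labels):
        # Rank speaker indices from most to least probable.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if true_id in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 utterances and 3 candidate speakers (made-up numbers).
toy_posteriors = [
    [0.1, 0.2, 0.7],
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
]
toy_labels = [2, 2, 0]
print(top_k_accuracy(toy_posteriors, toy_labels, k=1))
print(top_k_accuracy(toy_posteriors, toy_labels, k=2))
```

With k set to 5, the same function counts an utterance as correct whenever the true speaker appears anywhere in the five most probable candidates, which is why Top-5 accuracy is always at least as high as Top-1 accuracy.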

