Wed-SS-2-3-2 Speaker Diarization System based on DPCA Algorithm For Fearless Steps Challenge Phase-2

XueShuai Zhang (Institute of Acoustics, Chinese Academy of Sciences), Wenchao Wang (Institute of Acoustics, Chinese Academy of Sciences) and Pengyuan Zhang (Institute of Acoustics, Chinese Academy of Sciences)

Abstract: This paper describes the ASRGroup team's speaker diarization systems submitted to TRACK 2 of the Fearless Steps Challenge Phase-2. In this system, the similarity matrix among all segments of an audio recording is measured by Sequential Bidirectional Long Short-term Memory Networks (Bi-LSTM), and a clustering scheme based on the Density Peak Cluster Algorithm (DPCA) is proposed to cluster the segments. The system is compared with the Kaldi Toolkit diarization system (x-vector based on TDNN with a PLDA scoring model) and the Spectral system (similarity based on Bi-LSTM with the Spectral clustering algorithm). Experiments show that our system significantly outperforms both systems and achieves a Diarization Error Rate (DER) of 42.75% on the Dev dataset and 39.52% on the Eval dataset of TRACK 2 (Fearless Steps Challenge Phase-2). Compared with the baseline Kaldi Toolkit diarization system and the Spectral Clustering algorithm with Bi-LSTM similarity models, our system reduces the DER by 4.64% and 1.84% absolute on the Dev dataset and by 8.85% and 7.57% absolute on the Eval dataset, respectively.
