Thu-3-11-10 Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-task Learning

Wei Xue (JD AI Research), Ying Tong (JD AI Research), Chao Zhang (JD AI Research), Guohong Ding (JD AI Research), Xiaodong He (JD AI Research) and Bowen Zhou (JD AI Research)
Abstract: The performance of sound event localization and detection (SELD) degrades in source-overlapping cases, since the features of different sources collapse into one another and the network tends to fail to learn to separate them effectively. In this paper, by leveraging conventional microphone array signal processing to generate comprehensive representations for SELD, we propose a new SELD method based on multiple direction of arrival (DOA) beamforming and multi-task learning. By using multiple beamformers to extract the signals from different DOAs, the sound field is described more diversely, and specialised representations of the target source and the noise can be obtained. With labelled training data, the steering vector is estimated based on the cross-power spectra (CPS) and the signal presence probability (SPP), which eliminates the need to know the array geometry. We design two networks for sound event detection (SED) and sound source localization (SSL), and use a multi-task learning scheme for SED, in which the SSL-related task acts as a regularization. We conduct experiments using the database of the DCASE2019 SELD task, and the results show that the proposed method can achieve state-of-the-art performance.
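The abstract does not give the exact estimator, so the following is only a minimal sketch of how a steering vector might be estimated from SPP-weighted cross-power spectra without array geometry, and then used in a simple beamformer. It works on a single frequency bin and uses plain Python complex numbers. The function names, the choice of microphone 0 as reference, the matched-filter beamformer, and the synthetic noise-free data are all assumptions made for illustration, not the authors' implementation.

```python
import cmath


def estimate_steering_vector(frames, spp, ref=0):
    """Estimate a relative steering vector for one frequency bin.

    frames: list of T frames, each a list of M complex STFT coefficients.
    spp:    list of T signal presence probabilities in [0, 1].
    ref:    index of the reference microphone (assumed choice).
    The result is normalised so that entry `ref` equals 1.
    """
    num_mics = len(frames[0])
    # SPP-weighted cross-power spectrum of every microphone with the reference.
    cps = [0j] * num_mics
    for x, p in zip(frames, spp):
        for m in range(num_mics):
            cps[m] += p * x[m] * x[ref].conjugate()
    ref_power = cps[ref].real  # weighted auto-power of the reference channel
    if ref_power <= 0.0:
        raise ValueError("no frames with signal presence")
    return [c / ref_power for c in cps]


def beamform(frames, d):
    """Matched-filter beamformer y = d^H x / (d^H d), one output per frame."""
    norm = sum(abs(v) ** 2 for v in d)
    return [
        sum(dv.conjugate() * xv for dv, xv in zip(d, x)) / norm
        for x in frames
    ]


# Synthetic check: one source, three microphones, no noise (hypothetical data).
true_d = [1 + 0j, cmath.rect(1.0, -0.7), cmath.rect(1.0, -1.4)]
source = [cmath.rect(1.0, 0.3 * t) for t in range(8)]
frames = [[dv * s for dv in true_d] for s in source]
spp = [1.0] * len(frames)

est_d = estimate_steering_vector(frames, spp)
output = beamform(frames, est_d)
```

In this noise-free setting the estimate matches `true_d` and the beamformer output reproduces `source`. A multiple-DOA front end in the spirit of the abstract would presumably repeat the estimation per labelled DOA and per frequency bin, then feed the beamformer outputs to the networks as features.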