Thu-3-1-4 Domain Adaptation for Enhancing Speech-based Depression Detection in Natural Environmental Conditions Using Dilated CNNs

Zhaocheng Huang (School of Electrical Engineering and Telecommunications, UNSW Australia), Julien Epps (School of Electrical Engineering and Telecommunications, UNSW Australia), Dale Joachim (Sonde Health), Brian Stasak (University of New South Wales), James Williamson (MIT Lincoln Laboratory) and Thomas Quatieri (MIT Lincoln Laboratory)
Abstract: Depressive disorders are a major and growing concern worldwide, especially given the unmet need for widely deployable depression screening for use in real-world environments. Automatic speech-based depression screening technologies have shown promising results, but primarily in systems that are trained using speech recorded in laboratory conditions. They do not generalize well to data from more naturalistic settings. This paper addresses the generalizability issue by proposing multiple adaptation strategies that update pre-trained models based on a dilated convolutional neural network (CNN) framework, which improve depression detection performance in both clean and naturalistic environments. Experimental results on two depression corpora show that feature representations in CNN layers need to be adapted to accommodate environmental changes, and that increases in data quantity and quality are helpful for pre-training models for adaptation. The cross-corpus adapted systems produce relative improvements in unweighted average recall over non-adapted systems of 29.4% on the clean corpus and 17.2% on the naturalistic corpus.
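The abstract reports results as relative improvements in unweighted average recall (UAR). As a minimal sketch of how such figures are typically computed, the Python below takes UAR as the mean of per-class recalls and the relative improvement as the gain divided by the baseline. The function names and the toy labels are illustrative assumptions, not taken from the paper, and the numbers do not reproduce its results.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally
    regardless of how many samples it has."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def relative_improvement(baseline, adapted):
    """Gain of the adapted score over the baseline, as a fraction of the baseline."""
    return (adapted - baseline) / baseline


# Hypothetical toy labels: 1 = depressed, 0 = non-depressed (imbalanced on purpose).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
baseline_pred = [0, 0, 0, 0, 0, 0, 0, 1]  # misses one of the two positives
adapted_pred = [0, 0, 0, 0, 0, 1, 1, 1]   # finds both positives, one false alarm

uar_baseline = unweighted_average_recall(y_true, baseline_pred)
uar_adapted = unweighted_average_recall(y_true, adapted_pred)
gain = relative_improvement(uar_baseline, uar_adapted)

print(f"baseline UAR: {uar_baseline:.3f}")
print(f"adapted UAR:  {uar_adapted:.3f}")
print(f"relative improvement: {gain:.1%}")
```

Because UAR averages recall per class rather than per sample, a system that ignores the minority class is penalized even when its plain accuracy looks high, which is why the metric suits imbalanced screening data.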