Chenggang Zhang (Computer Science Department, Inner Mongolian University) and Xueliang Zhang (Computer Science Department, Inner Mongolian University)
Acoustic echo cancellation (AEC) cancels the acoustic feedback between a loudspeaker and a microphone. Ideally, AEC is a linear problem that can be solved by adaptive filtering. In practice, however, two problems severely degrade AEC performance: (1) double-talk and (2) nonlinear distortion, caused mainly by loudspeakers and/or power amplifiers. To address both problems, we propose a novel cascaded AEC that integrates adaptive filtering and deep learning. Specifically, two long short-term memory (LSTM) networks are employed, one for double-talk detection (DTD) and one for nonlinearity modeling, while adaptive filtering removes the linear part of the echo. Experimental results show that the proposed method outperforms conventional methods by a considerable margin on objective evaluation metrics in the matched scenario. Moreover, the proposed method generalizes much better to unmatched scenarios than an end-to-end deep learning method.
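To make the linear stage concrete, the following is a minimal sketch of an adaptive echo canceller whose adaptation is frozen during double-talk. It rests on several assumptions not stated in the abstract: the adaptive algorithm is taken to be normalized least mean squares (NLMS), the function name `nlms_echo_canceller` and its parameters are hypothetical, and the boolean `double_talk` mask merely stands in for the output of the LSTM-based detector. The LSTM for nonlinearity modeling is not sketched here.

```python
import numpy as np


def nlms_echo_canceller(far_end, mic, num_taps=64, step=0.5, eps=1e-8,
                        double_talk=None):
    """Remove the linear echo of `far_end` from `mic` with an NLMS filter.

    Assumption: `double_talk` is a per-sample boolean mask (hypothetical
    stand-in for a double-talk detector). Where it is True, the filter
    still produces an echo estimate but its weights are not updated.
    """
    far_end = np.asarray(far_end, dtype=float)
    mic = np.asarray(mic, dtype=float)
    if double_talk is None:
        double_talk = np.zeros(len(mic), dtype=bool)

    weights = np.zeros(num_taps)   # estimate of the echo path
    buffer = np.zeros(num_taps)    # recent far-end samples, newest first
    error = np.zeros(len(mic))     # echo-cancelled output

    for n in range(len(mic)):
        # Shift the newest far-end sample into the delay line.
        buffer = np.roll(buffer, 1)
        buffer[0] = far_end[n]

        # Subtract the estimated linear echo from the microphone signal.
        error[n] = mic[n] - weights @ buffer

        # Adapt only when no near-end speech is flagged.
        if not double_talk[n]:
            weights += step * error[n] * buffer / (buffer @ buffer + eps)

    return error, weights
```

Freezing the update during double-talk keeps near-end speech from being treated as residual echo, which would otherwise push the weights away from the true echo path. In a cascaded design such as the one described above, the residual returned here would still contain any nonlinear echo components, which a purely linear filter cannot model.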