Wed-3-7-8 Efficient Low-Latency Speech Enhancement with Mobile Audio Streaming Networks

Michał Romaniuk (Samsung R&D Institute Poland), Piotr Masztalski (Samsung R&D Institute Poland), Karol Piaskowski (Samsung R&D Institute Poland) and Mateusz Matuszewski (Samsung R&D Institute Poland)
Abstract: We propose Mobile Audio Streaming Networks (MASnet) for efficient low-latency speech enhancement, which is particularly suitable for mobile devices and other applications where computational capacity is a limitation. MASnet processes linear-scale spectrograms, transforming successive noisy frames into complex-valued ratio masks which are then applied to the respective noisy frames. MASnet can operate in a low-latency incremental inference mode which matches the complexity of layer-by-layer batch mode. Compared to a similar fully-convolutional architecture, MASnet incorporates depthwise and pointwise convolutions for a large reduction in fused multiply-accumulate operations per second (FMA/s), at the cost of some reduction in SNR.
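
To illustrate why replacing a standard convolution with a depthwise convolution followed by a pointwise convolution reduces the FMA count, the sketch below compares the two for a single 1-D convolutional layer. It is not taken from the paper: the layer sizes, the frame rate, and the function names are hypothetical values chosen for illustration, and bias terms are ignored.

```python
# Rough FMA comparison: standard convolution vs. depthwise + pointwise.
# All sizes below are hypothetical and are NOT MASnet's actual configuration.


def standard_conv_fma(kernel_size, in_channels, out_channels):
    """FMAs per output frame for a standard 1-D convolution.

    Every output channel combines every input channel over the whole kernel.
    """
    return kernel_size * in_channels * out_channels


def separable_conv_fma(kernel_size, in_channels, out_channels):
    """FMAs per output frame for a depthwise conv followed by a pointwise conv."""
    depthwise = kernel_size * in_channels   # one filter per input channel
    pointwise = in_channels * out_channels  # 1x1 convolution mixing channels
    return depthwise + pointwise


def fma_per_second(fma_per_frame, frames_per_second):
    """Scale a per-frame cost to FMA/s for a streaming model."""
    return fma_per_frame * frames_per_second


# Assumed example layer: kernel of 3 frames, 64 channels in and out,
# 100 spectrogram frames per second (e.g. a 10 ms hop).
KERNEL_SIZE = 3
IN_CHANNELS = 64
OUT_CHANNELS = 64
FRAMES_PER_SECOND = 100

standard = standard_conv_fma(KERNEL_SIZE, IN_CHANNELS, OUT_CHANNELS)
separable = separable_conv_fma(KERNEL_SIZE, IN_CHANNELS, OUT_CHANNELS)

print("standard  FMA/s:", fma_per_second(standard, FRAMES_PER_SECOND))
print("separable FMA/s:", fma_per_second(separable, FRAMES_PER_SECOND))
print("reduction factor: %.2f" % (standard / separable))
```

Under these assumed sizes the separable layer needs roughly a third of the operations. The saving grows with the kernel size, because the kernel-dependent cost no longer multiplies the number of output channels.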