Hirotoshi Takeuchi (The University of Tokyo), Kunio Kashino (NTT Corporation), Yasunori Ohishi (NTT Corporation), and Hiroshi Saruwatari (The University of Tokyo)
Abstract:
Convolutional neural networks have been successfully applied to a
variety of audio signal processing tasks including sound source
separation, speech recognition and acoustic scene understanding. Since
many pitched sounds have a harmonic structure, an operation called
harmonic convolution has been proposed to take advantage of this
structure in audio signals. However, its computational cost is higher
than that of normal convolution. This paper proposes a faster method
for computing harmonic convolution, called Harmonic Lowering. The
method unrolls the input data to a redundant
layout so that the normal convolution operation can be applied. The
analyses of runtime and of the number of multiplication operations
show that the proposed method accelerates harmonic convolution by a
factor of 2 to 7 over the conventional method under realistic
parameter settings, without introducing any approximation.
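
The unrolling idea can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on simplifying assumptions that do not come from the abstract: a spectrogram indexed by integer frequency bins, harmonics taken as integer multiples of the bin index with no interpolation, and bins beyond the spectrum treated as zero. The function names (harmonic_conv_direct, harmonic_lowering, conv_channels) are hypothetical. Under these assumptions, copying the rows x[k*f] into a stack of K channels lets an ordinary multi-channel convolution along time reproduce the direct harmonic sum exactly.

```python
def harmonic_conv_direct(x, w):
    """Direct form (assumed): y[f][t] = sum_k sum_s w[k][s] * x[(k+1)*f][t+s]."""
    F, T = len(x), len(x[0])
    K, S = len(w), len(w[0])
    y = [[0.0] * (T - S + 1) for _ in range(F)]
    for f in range(F):
        for t in range(T - S + 1):
            acc = 0.0
            for k in range(K):
                g = (k + 1) * f          # bin of the (k+1)-th harmonic of f
                if g < F:                # bins beyond the spectrum count as zero
                    for s in range(S):
                        acc += w[k][s] * x[g][t + s]
            y[f][t] = acc
    return y


def harmonic_lowering(x, K):
    """Unroll x into K redundant channels: u[k][f] is the row x[(k+1)*f]."""
    F, T = len(x), len(x[0])
    zero = [0.0] * T                     # stands in for out-of-range harmonics
    return [[x[(k + 1) * f] if (k + 1) * f < F else zero for f in range(F)]
            for k in range(K)]


def conv_channels(u, w):
    """Ordinary multi-channel correlation along time, one frequency row at a time."""
    K, F, T = len(u), len(u[0]), len(u[0][0])
    S = len(w[0])
    return [[sum(w[k][s] * u[k][f][t + s] for k in range(K) for s in range(S))
             for t in range(T - S + 1)]
            for f in range(F)]


# Toy input: 6 frequency bins, 5 time frames; kernel with 3 harmonics, 2 time taps.
x = [[float(10 * f + t) for t in range(5)] for f in range(6)]
w = [[1.0, 2.0], [0.5, -1.0], [0.25, 0.0]]

direct = harmonic_conv_direct(x, w)
lowered = conv_channels(harmonic_lowering(x, len(w)), w)
same = all(abs(a - b) < 1e-9
           for row_a, row_b in zip(direct, lowered)
           for a, b in zip(row_a, row_b))
print("lowered result matches direct result:", same)
```

In this sketch the lowered layout stores each input row up to K times, which is the redundancy that buys access to the standard convolution routine. Any speed-up would come from using an optimized convolution in place of conv_channels, which the pure-Python loops here do not attempt to demonstrate.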