MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution

Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao


Length: 0:12:37

19 Jan 2021

Recent neural vocoders usually use a WaveNet-like network to capture the long-term dependencies of the waveform, but such networks need a large number of parameters to model them well. In this paper, an efficient network, named location-variable convolution, is proposed to model the dependencies of the waveform. Whereas WaveNet uses the same convolution kernels to capture the dependencies of arbitrary waveforms, the location-variable convolution uses a kernel predictor to generate multiple sets of convolution kernels from the mel-spectrum, and each set of kernels is applied to its associated interval of the waveform. By combining WaveGlow with the location-variable convolution, an efficient vocoder, named MelGlow, is designed. Experiments on the LJSpeech dataset show that MelGlow achieves better performance than WaveGlow at small model sizes, which verifies the effectiveness of the location-variable convolution and its potential for further optimization.
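The idea in the abstract, one kernel per waveform interval predicted from the matching mel-spectrum frame, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single channel, an odd kernel size, one mel frame per `hop` samples, and a plain linear map as a stand-in for the kernel predictor. The names `predict_kernels`, `location_variable_conv`, `weight`, and `hop` are hypothetical and chosen only for illustration.

```python
import numpy as np


def predict_kernels(mel, weight):
    """Hypothetical stand-in for the kernel predictor (a single linear map).

    mel:    (n_frames, n_mels) mel-spectrum frames
    weight: (n_mels, kernel_size) assumed predictor weights
    returns (n_frames, kernel_size), one kernel per frame
    """
    return mel @ weight


def location_variable_conv(x, kernels, hop):
    """Apply a different kernel to each interval of `hop` samples.

    x:       (n_frames * hop,) single-channel waveform
    kernels: (n_frames, kernel_size) with odd kernel_size
    returns  (n_frames * hop,) output with "same" zero padding
    """
    n_frames, k = kernels.shape
    assert x.shape[0] == n_frames * hop, "assumes one kernel per hop-sized interval"
    assert k % 2 == 1, "this sketch assumes an odd kernel size"
    pad = k // 2
    xp = np.pad(x, pad)  # zero-pad so the output keeps the input length
    y = np.empty(x.shape[0], dtype=float)
    for i in range(n_frames):
        start = i * hop
        # The interval plus the context samples its kernel needs at both edges.
        seg = xp[start:start + hop + 2 * pad]
        # One length-k window per output sample: shape (hop, k).
        windows = np.lib.stride_tricks.sliding_window_view(seg, k)
        # Cross-correlation (the deep-learning "convolution") with this interval's kernel.
        y[start:start + hop] = windows @ kernels[i]
    return y
```

When every interval receives the same kernel, this reduces to an ordinary convolution with one shared kernel, which is the WaveNet-style case the abstract contrasts against. The full method presumably uses multi-channel kernels and a learned predictor network, neither of which this sketch attempts to reproduce.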

Tags:

sps conference

slt 2021

