
FRE-GAN 2: FAST AND EFFICIENT FREQUENCY-CONSISTENT AUDIO SYNTHESIS

Sang-Hoon Lee, Ji-Hoon Kim, Kang-Eun Lee, Seong-Whan Lee

Length: 00:05:16
08 May 2022

Although neural vocoders have improved significantly in recent years, most of these models trade off audio quality against computational complexity. Because large models are impractical on low-resource devices, a practical neural vocoder must synthesize high-quality audio efficiently. In this paper, we present Fre-GAN 2, a fast and efficient high-quality audio synthesis model. For fast synthesis, the Fre-GAN 2 generator synthesizes only the low- and high-frequency parts of the audio and uses the inverse discrete wavelet transform to reproduce audio at the target resolution. We also introduce adversarial periodic feature distillation, which enables the model to synthesize high-quality audio with only a small number of parameters. Experimental results show the superiority of Fre-GAN 2 in audio quality. Furthermore, compared with Fre-GAN, Fre-GAN 2 generates audio 10.91× faster and has 21.23× fewer parameters.
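To illustrate how low- and high-frequency parts can be recombined into full-resolution audio, here is a minimal sketch of a one-level discrete wavelet transform and its inverse. The Haar wavelet, the single decomposition level, and the function names haar_dwt and haar_idwt are assumptions made for illustration. The abstract does not state which wavelet or how many levels Fre-GAN 2 uses, and this is not the authors' implementation.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level Haar DWT (assumed wavelet): split an even-length signal
    into a low-frequency and a high-frequency sub-band, each half as long."""
    low, high = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        low.append((a + b) / SQRT2)   # scaled pairwise sum: coarse content
        high.append((a - b) / SQRT2)  # scaled pairwise difference: detail
    return low, high


def haar_idwt(low, high):
    """One-level inverse Haar DWT: interleave the reconstructed even and
    odd samples to recover a signal twice as long as each sub-band."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)  # even-indexed sample
        out.append((l - h) / SQRT2)  # odd-indexed sample
    return out


# Round trip: decomposing and then inverting recovers the input samples.
samples = [0.5, -0.25, 1.0, 0.75, -1.0, 0.0, 0.125, 0.5]
low_band, high_band = haar_dwt(samples)
restored = haar_idwt(low_band, high_band)
```

In this sketch each sub-band has half as many samples as the waveform, so a model that predicts the sub-bands works at a lower temporal resolution and leaves the final upsampling step to the parameter-free inverse transform.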
