Skip to main content
  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 08:39
09 Jul 2020

Speech intelligibility enhancement is a perceptual enhancement technique for clean speech reproduced in noisy environments. It is typically used in the listening stage of multimedia communications. In this study, we enhance speech intelligibility by speaking style conversion (SSC), which is a data-driven approach inspired by a vocal mechanism named Lombard effect. The proposed SSC method combines star generative adversarial network (StarGAN) based mapping and dynamic range compression (DRC). It has two main advantages: 1) different from gender-independent conversion in previous studies, StarGAN can separately learn speech features of different genders to provide a differential conversion among genders with a single model and non-parallel training data; 2) we design a multi-level enhancement strategy with the use of DRC in the StarGAN architecture, which improves the SSC performance in strong noise interference. Experiments show that our method outperforms baseline methods.

Value-Added Bundle(s) Including this Product

More Like This

  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00