Multi-speaker Speech Synthesis from Electromyographic Signals by Soft Speech Unit Prediction

Kevin Scheck (University of Bremen); Tanja Schultz (University of Bremen)

06 Jun 2023

Electromyographic (EMG) signals of articulatory muscles reflect the speech production process even when the user speaks silently, i.e., moves the articulators without producing audible sound. We propose Speech-Unit-based EMG-to-Speech (SU-E2S), a system that uses EMG to synthesize speech which contains the articulated content but is vocalized in another voice, determined by an acoustic reference utterance. It builds on a Voice Conversion (VC) system that decomposes acoustic speech into continuous soft speech units and a speaker embedding and then reconstructs acoustic features from them. SU-E2S performs speech synthesis by predicting soft speech units from EMG and feeding them to the VC system. Experiments show that the SU-E2S output is on par, in terms of intelligibility, with predicting acoustic features directly from EMG, while adding the ability to synthesize speech in other voices.
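As a rough illustration of the pipeline the abstract describes, the sketch below wires an EMG encoder that predicts continuous soft speech units to a VC decoder conditioned on a speaker embedding. All module names, layer choices, and dimensions here are assumptions for illustration; the abstract does not specify the authors' architecture.

```python
# Minimal sketch of the SU-E2S pipeline, assuming PyTorch and
# hypothetical dimensions (8 EMG channels, 256-d units, 80 mel bins).
import torch
import torch.nn as nn

class EMGEncoder(nn.Module):
    """Predicts a sequence of continuous soft speech units from EMG features."""
    def __init__(self, n_emg_channels=8, unit_dim=256, hidden=512):
        super().__init__()
        self.rnn = nn.LSTM(n_emg_channels, hidden, num_layers=3,
                           batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, unit_dim)

    def forward(self, emg):                 # emg: (batch, time, channels)
        h, _ = self.rnn(emg)
        return self.proj(h)                 # soft units: (batch, time, unit_dim)

class VCDecoder(nn.Module):
    """Reconstructs acoustic features from soft units plus a speaker embedding."""
    def __init__(self, unit_dim=256, spk_dim=256, n_mels=80, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(unit_dim + spk_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_mels),
        )

    def forward(self, units, spk_emb):      # spk_emb: (batch, spk_dim)
        # Broadcast the speaker embedding over time and decode jointly.
        spk = spk_emb.unsqueeze(1).expand(-1, units.size(1), -1)
        return self.net(torch.cat([units, spk], dim=-1))  # (batch, time, n_mels)

if __name__ == "__main__":
    encoder, decoder = EMGEncoder(), VCDecoder()
    emg = torch.randn(1, 200, 8)            # silently articulated utterance
    spk_emb = torch.randn(1, 256)           # derived from an acoustic reference
    mels = decoder(encoder(emg), spk_emb)   # acoustic features in the target voice
    print(mels.shape)                       # torch.Size([1, 200, 80])
```

Because the speaker identity enters only through the reference embedding, swapping that embedding changes the output voice without retraining the EMG encoder, which is the multi-speaker property the abstract highlights.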
