Speech Enhancement Using Self-Adaptation And Multi-Head Self-Attention

Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 11:07

04 May 2020

This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance. Conventional studies of deep neural network (DNN)--based speech enhancement mainly focus on building a speaker independent model. Meanwhile, in speech applications including speech recognition and synthesis, it is known that model adaptation to the target speaker improves the accuracy. Our research question is whether a DNN for speech enhancement can be adopted to unknown speakers without any auxiliary guidance signal in test-phase. To achieve this, we adopt multi-task learning of speech enhancement and speaker identification, and use the output of the final hidden layer of speaker identification branch as an auxiliary feature. In addition, we use multi-head self-attention for capturing long-term dependencies in the speech and noise. Experimental results on a public dataset show that our strategy achieves the state-of-the-art performance and also outperform conventional methods in terms of subjective quality.

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020

Speech Enhancement Using Self-Adaptation And Multi-Head Self-Attention

Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi

Value-Added Bundle(s) Including this Product

ICASSP 2020 Virtual Conference - Presentation Videos Product Bundle

More Like This

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle

IEEE ICASSP 2024, 1 4-19 April 2024, Seoul, Korea. Conference Presentation Videos Bundle

ICIP 2022, October 16-19, 2022, Bordeaux, France - Presentation Videos Product Bundle

Join an IEEE Society