Utilizing Full Signal Reconstruction and Leveraging Perception for Deep Learning-Based Noisy Speech Enhancement

Dr. Donald Williamson

SPS Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 56:19

13 Mar 2023

Deep learning has helped make speech processing applications, such as speech recognition, speech synthesis, and speech translation, more prevalent in everyday life. It has also advanced the field of speech enhancement, where it has become the state-of-the-art approach to removing unwanted sounds. Successful speech enhancement can have a profound impact on how people communicate with each other and through electronic devices, so it is an area of research that must be addressed. Although several deep learning approaches that focus on novel architectures and optimization strategies have been developed, there remain other aspects of the problem that have not been adequately investigated.

In this talk, we will discuss how human perception should be leveraged to further address problems within speech enhancement, and how it can be predicted and used to improve noise reduction. Additionally, we will provide highlights of my work in complex-domain speech enhancement, which encouraged full signal reconstruction.

Tags: time-frequency analysis, speech, spectrogram, webinars, signal to noise ratio
