A Sequential Contrastive Learning Framework For Robust Dysarthric Speech Recognition

Lidan Wu, Daoming Zong, Jing Zhao, Shiliang Sun


Length: 00:07:20

11 Jun 2021

Dysarthria is a manifestation of disrupted neuromuscular physiology that results in uneven, slow, slurred, harsh, or quiet speech. Despite the remarkable progress of automatic speech recognition (ASR), developing stable ASR for dysarthric individuals remains very challenging because of high intra- and inter-speaker variation and data scarcity. In this paper, we propose a contrastive learning framework for robust dysarthric speech recognition (DSR) that captures dysarthric speech variability. Several speech data augmentation strategies are explored to form the two branches of the framework, which also alleviates the scarcity of dysarthric speech data. We also develop an efficient projection head that acts on a sequence of learned hidden representations to define the contrastive loss. Experimental results on DSR demonstrate that the model performs better than or comparably to the supervised baseline.
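The abstract does not give the exact loss, augmentations, or projection head, so the following is only a minimal sketch of the general two-branch idea, under stated assumptions. It uses a generic NT-Xent (SimCLR-style) contrastive loss, two hypothetical augmentations (`add_noise` and `time_mask`), and a mean-pooling stand-in (`project`) where the paper uses a learned projection head over the sequence of hidden representations. The encoder is omitted entirely, and all function names and parameter values are illustrative rather than taken from the paper.

```python
import math
import random


def add_noise(frames, scale, rng):
    """Hypothetical augmentation: add Gaussian noise to every feature value."""
    return [[x + rng.gauss(0.0, scale) for x in frame] for frame in frames]


def time_mask(frames, width, rng):
    """Hypothetical augmentation: zero out `width` consecutive frames."""
    start = rng.randrange(0, max(1, len(frames) - width + 1))
    return [
        [0.0] * len(frame) if start <= t < start + width else list(frame)
        for t, frame in enumerate(frames)
    ]


def project(frames):
    """Stand-in projection head (assumption): mean-pool over time, then L2-normalise."""
    dim = len(frames[0])
    pooled = [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]
    norm = math.sqrt(sum(x * x for x in pooled)) or 1.0
    return [x / norm for x in pooled]


def nt_xent(z_a, z_b, temperature=0.5):
    """Generic NT-Xent loss: z_a[i] and z_b[i] are the two views of utterance i."""
    z = z_a + z_b
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same utterance
        sims = [sum(p * q for p, q in zip(z[i], z[k])) / temperature
                for k in range(2 * n)]
        others = [sims[k] for k in range(2 * n) if k != i]
        m = max(others)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denominator - sims[pos]
    return total / (2 * n)


def contrastive_loss_for_batch(batch, rng, temperature=0.5):
    """Form two augmented branches per utterance and score them with NT-Xent."""
    view_a = [project(add_noise(utt, 0.05, rng)) for utt in batch]
    view_b = [project(time_mask(utt, 2, rng)) for utt in batch]
    return nt_xent(view_a, view_b, temperature)
```

In this sketch each utterance is a list of feature frames, and the loss is low when the two augmented views of the same utterance are closer to each other than to views of other utterances in the batch. In a real system the views would pass through a trainable encoder before the projection step.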

Chairs:

Paavo Alku

Tags:

Signal Processing Society

IEEE ICASSP 2021

virtual conference

June 6-11, 2021
