MINIMIZING RESIDUALS FOR NATIVE-NONNATIVE VOICE CONVERSION IN A SPARSE, ANCHOR-BASED REPRESENTATION OF SPEECH

Christopher Liberatore, Ricardo Gutierrez-Osuna

Length: 00:07:28

10 May 2022

We present a dictionary-learning algorithm that reduces the sparse-coding residual of an exemplar-based method for native-to-nonnative voice conversion (VC). The algorithm iteratively updates the source and target speaker dictionaries to reduce both the residual and the voice conversion error, thereby increasing synthesis quality. We evaluate the method on speech from the ARCTIC and L2-ARCTIC corpora and compare it to a baseline exemplar-based VC algorithm. The proposed algorithm significantly improves synthesis quality, to more than double that of the baseline system, while using two orders of magnitude fewer atoms. It also significantly reduces both the VC error and the residual magnitude. We discuss the implications of the algorithm for exemplar-based VC systems more broadly.
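
The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes time-aligned (parallel) source and target feature matrices X and Y, a nonnegative L1-penalised sparse coder, and plain gradient steps on each dictionary. All function names, parameters and step sizes (sparse_code, dictionary_step, learn_dictionaries, lam, n_atoms) are hypothetical choices made for illustration.

```python
import numpy as np

def sparse_code(X, A, W0=None, lam=0.1, n_iter=50):
    """Nonnegative sparse codes for X ~ A @ W via proximal gradient descent.

    Assumed objective: 0.5 * ||X - A W||_F^2 + lam * sum(W), with W >= 0.
    """
    W = np.zeros((A.shape[1], X.shape[1])) if W0 is None else W0.copy()
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = A.T @ (A @ W - X)
        W = np.maximum(W - step * (grad + lam), 0.0)  # L1 shrinkage + projection
    return W

def dictionary_step(X, Y, A, B, W):
    """One gradient step on each dictionary with the codes W held fixed."""
    step = 1.0 / (np.linalg.norm(W, 2) ** 2 + 1e-12)
    A_new = A - step * (A @ W - X) @ W.T  # lowers source residual ||X - A W||
    B_new = B - step * (B @ W - Y) @ W.T  # lowers conversion error ||Y - B W||
    return A_new, B_new

def learn_dictionaries(X, Y, n_atoms=8, n_rounds=20, lam=0.1, seed=0):
    """Alternate sparse coding (source side only) and dictionary updates."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[1], size=n_atoms, replace=False)
    A, B = X[:, idx].copy(), Y[:, idx].copy()  # paired exemplars as initial atoms
    W, history = None, []
    for _ in range(n_rounds):
        W = sparse_code(X, A, W0=W, lam=lam)   # warm-started codes from the source
        A, B = dictionary_step(X, Y, A, B, W)
        residual = np.linalg.norm(X - A @ W)   # sparse-coding residual
        vc_error = np.linalg.norm(Y - B @ W)   # error of the converted frames B @ W
        history.append((residual, vc_error))
    return A, B, history
```

In this sketch a source frame is converted by computing its codes against A and multiplying them by B. Each dictionary step cannot increase its own error term while the codes are held fixed, but the sketch makes no claim about the convergence behaviour or the results reported for the actual method.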

Tags: residual, dictionary learning, exemplar voice conversion, sparse coding, voice conversion

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle