DEEP PERFORMER: SCORE-TO-AUDIO MUSIC PERFORMANCE SYNTHESIS

Hao-Wen Dong, Taylor Berg-Kirkpatrick, Julian McAuley, Cong Zhou

SPS

Length: 00:15:36

13 May 2022

Music performance synthesis aims to synthesize a musical score into a natural performance. In this paper, we borrow recent advances in text-to-speech synthesis and present the Deep Performer, a novel system for score-to-audio music performance synthesis. Unlike speech, music often contains polyphony and long notes. Hence, we propose two new techniques for handling polyphonic inputs and providing fine-grained conditioning in a transformer encoder-decoder model. To train the proposed system, we present a new violin dataset consisting of paired recordings and scores along with estimated alignments between them. We show that our proposed model can synthesize music with clear polyphony and harmonic structures. In a listening test, we achieve competitive quality against the baseline model, a conditional generative audio model, in terms of pitch accuracy, timbre and noise level. Moreover, our proposed model significantly outperforms the baseline on an existing piano dataset in overall quality.

Tags:

music information retrieval

computer music

audio synthesis

machine learning

neural network

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
