FEW-SHOT MUSICAL SOURCE SEPARATION

Yu Wang, Juan Pablo Bello, Daniel Stoller, Rachel M. Bittner

Length: 00:09:29

08 May 2022

Deep learning-based approaches to musical source separation are often limited to the instrument classes they were trained on and do not generalize to unseen instruments. To address this, we propose a few-shot musical source separation paradigm. We condition a generic U-Net source separation model on a few audio examples of the target instrument. A few-shot conditioning encoder, trained jointly with the U-Net, encodes the audio examples into a conditioning vector that configures the U-Net via feature-wise linear modulation (FiLM). We evaluate the trained models on real musical recordings in the MUSDB18 and MedleyDB datasets. We show that our proposed few-shot conditioning paradigm outperforms the baseline one-hot instrument-class conditioned model for both seen and unseen instruments. To extend our approach to a wider variety of real-world scenarios, we also experiment with different conditioning example characteristics, including examples from different recordings, examples with multiple sources, and negative conditioning examples.
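The conditioning step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the element-wise mean used to pool the example embeddings, the single dense layers that produce the FiLM parameters, and all names and toy values below are assumptions made for illustration. Only the FiLM operation itself, a per-channel scale and shift of the feature maps, follows the standard definition.

```python
from statistics import fmean


def average_embeddings(embeddings):
    """Pool per-example embeddings into one conditioning vector.

    Assumption: an element-wise mean; the abstract does not say how
    the few-shot encoder combines its examples.
    """
    return [fmean(dim) for dim in zip(*embeddings)]


def linear(weights, bias, vector):
    """Dense layer; `weights` holds one row per output unit."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def film(feature_maps, gamma, beta):
    """FiLM: scale and shift every value in channel c by gamma[c], beta[c]."""
    return [
        [g * x + b for x in channel]
        for channel, g, b in zip(feature_maps, gamma, beta)
    ]


# Hypothetical toy values: two conditioning examples with 2-dim embeddings.
example_embeddings = [[1.0, 3.0], [3.0, 5.0]]
cond = average_embeddings(example_embeddings)

# Hypothetical dense layers mapping the conditioning vector to one
# (gamma, beta) pair per channel of a 3-channel feature map.
w_gamma = [[0.5, 0.0], [0.0, 0.25], [1.0, 0.0]]
b_gamma = [0.0, 0.0, 0.0]
w_beta = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
b_beta = [0.5, 0.0, 0.0]

gamma = linear(w_gamma, b_gamma, cond)
beta = linear(w_beta, b_beta, cond)

# Stand-in for one U-Net layer's activations: 3 channels, 2 values each.
features = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
modulated = film(features, gamma, beta)
print(modulated)  # [[1.5, 2.5], [2.0, 1.0], [6.0, 1.0]]
```

In this sketch the same separation network can be pointed at a different instrument by changing only the conditioning examples, since they determine gamma and beta while the feature extractor stays fixed.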

Tags: source separation, music, few-shot learning, FiLM conditioning

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
