HARMONIC-TEMPORAL FACTOR DECOMPOSITION FOR UNSUPERVISED MONAURAL SEPARATION OF HARMONIC SOUNDS

Tomohiko Nakamura, Hirokazu Kameoka

Length: 00:15:11

08 May 2022

We address the problem of separating a monaural mixture of harmonic sounds into the audio signals of individual semitones in an unsupervised manner. Unsupervised monaural audio source separation has mainly been addressed by two approaches: one rooted in computational auditory scene analysis (CASA) and the other based on non-negative matrix factorization (NMF). These approaches rely on different cues to make source separation possible. A CASA-based method, harmonic-temporal clustering (HTC), focuses on the local time-frequency structure of individual sources, whereas NMF focuses on the global time-frequency structure of music spectrograms. Noting that these cues do not conflict with each other, we propose a monaural source separation framework, harmonic-temporal factor decomposition (HTFD), built on a spectrogram model that encompasses the features of the models used in the NMF and HTC approaches. We further incorporate a source-filter model to build an extension of HTFD, source-filter HTFD (SF-HTFD). We derive efficient parameter estimation algorithms for HTFD and SF-HTFD based on the auxiliary function principle. Through audio source separation experiments, we show the efficacy of HTFD and SF-HTFD compared with conventional methods. Furthermore, we demonstrate the effectiveness of HTFD and SF-HTFD for automatic musical key transposition.
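As a rough illustration of the NMF side of the abstract only (this is not HTFD or SF-HTFD, whose models and update rules are not given on this page), the sketch below factorizes a small non-negative "spectrogram" V into spectral templates W and activations H. It uses the classic multiplicative updates for the generalized Kullback-Leibler divergence, which are a well-known example of update rules obtained from an auxiliary function, so the objective does not increase from one iteration to the next. The function names `matmul`, `kl_divergence`, and `nmf_kl`, the toy matrix, the rank, and the iteration count are all assumptions made for illustration. The code is kept dependency-free; a practical implementation would normally use an array library.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(V, X):
    """Generalized KL divergence between strictly positive matrices V and X."""
    return sum(
        v * math.log(v / x) - v + x
        for v_row, x_row in zip(V, X)
        for v, x in zip(v_row, x_row)
    )


def nmf_kl(V, rank, n_iter=50, seed=0):
    """Approximate V (F x T) by W (F x rank) times H (rank x T)."""
    rng = random.Random(seed)
    F, T = len(V), len(V[0])
    # Strictly positive random initialization (illustrative choice).
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(F)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(T)] for _ in range(rank)]
    losses = [kl_divergence(V, matmul(W, H))]
    for _ in range(n_iter):
        # Update activations H with templates W held fixed.
        X = matmul(W, H)
        for k in range(rank):
            col_sum = sum(W[f][k] for f in range(F))
            for t in range(T):
                num = sum(W[f][k] * V[f][t] / X[f][t] for f in range(F))
                H[k][t] *= num / col_sum
        # Update templates W with the new activations H held fixed.
        X = matmul(W, H)
        for k in range(rank):
            row_sum = sum(H[k][t] for t in range(T))
            for f in range(F):
                num = sum(H[k][t] * V[f][t] / X[f][t] for t in range(T))
                W[f][k] *= num / row_sum
        losses.append(kl_divergence(V, matmul(W, H)))
    return W, H, losses


# Toy 4 x 4 magnitude "spectrogram" built from two alternating spectral shapes.
V = [
    [4.0, 0.5, 4.0, 0.5],
    [2.0, 0.5, 2.0, 0.5],
    [0.5, 3.0, 0.5, 3.0],
    [0.5, 1.5, 0.5, 1.5],
]
W, H, losses = nmf_kl(V, rank=2)
```

Here each column of W plays the role of a spectral template and each row of H its activation over time. This captures only the global, low-rank structure that the abstract attributes to NMF; it imposes no harmonic or temporal-continuity structure of the kind HTC models.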

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
