SEMI-SUPERVISED NEURAL CHORD ESTIMATION BASED ON A VARIATIONAL AUTOENCODER WITH LATENT CHORD LABELS AND FEATURES

Yiming Wu, Eita Nakamura, Kazuyoshi Yoshii, Tristan Carsault

SPS


Length: 00:11:15

10 May 2022

This paper describes a statistically principled semi-supervised method of automatic chord estimation (ACE) that can make effective use of music signals regardless of the availability of chord annotations. The typical approach to ACE is to train a deep classification model in a supervised manner by using only annotated music signals. In this discriminative approach, prior knowledge about chord label sequences has scarcely been taken into account. In contrast, we propose a unified generative and discriminative approach in the framework of amortized variational inference. More specifically, we formulate a deep generative model that represents the generative process of chroma vectors from discrete labels and continuous features, which are assumed to follow a Markov model favoring self-transitions and a standard Gaussian distribution, respectively. Given chroma vectors as observed data, the posterior distributions of the latent labels and features are computed approximately by using deep classification and recognition models, respectively. These three models form a variational autoencoder and can be trained jointly in a semi-supervised manner. The experimental results show that the regularization of the classification model based on the Markov prior of chord labels and the generative model of chroma vectors improved the performance of ACE even under the supervised condition.
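The Markov prior on chord labels mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the number of chord classes (25), the self-transition probability (0.9), the uniform initial distribution, and the helper names `self_transition_matrix` and `markov_log_prior` are all assumptions made here for illustration. The sketch only shows why a prior that favors self-transitions assigns a higher log-probability to a temporally smooth label sequence than to a rapidly alternating one.

```python
import math


def self_transition_matrix(num_labels, stay_prob):
    """Build a row-stochastic matrix that favors staying on the same label.

    Hypothetical helper: the diagonal holds stay_prob and the remaining
    probability mass is spread evenly over the other labels.
    """
    if num_labels < 2:
        raise ValueError("need at least two labels")
    if not 0.0 < stay_prob < 1.0:
        raise ValueError("stay_prob must lie strictly between 0 and 1")
    move_prob = (1.0 - stay_prob) / (num_labels - 1)
    return [
        [stay_prob if i == j else move_prob for j in range(num_labels)]
        for i in range(num_labels)
    ]


def markov_log_prior(labels, transition, initial=None):
    """Log-probability of a label sequence under a first-order Markov chain."""
    if not labels:
        raise ValueError("label sequence must not be empty")
    num_labels = len(transition)
    if initial is None:
        # Assumption: uniform distribution over the first label.
        initial = [1.0 / num_labels] * num_labels
    log_p = math.log(initial[labels[0]])
    for prev, cur in zip(labels, labels[1:]):
        log_p += math.log(transition[prev][cur])
    return log_p


NUM_LABELS = 25   # assumed vocabulary size, not taken from the source
STAY_PROB = 0.9   # assumed self-transition probability

transition = self_transition_matrix(NUM_LABELS, STAY_PROB)

smooth = [0, 0, 0, 0, 7, 7, 7, 7]       # one chord change
fragmented = [0, 7, 0, 7, 0, 7, 0, 7]   # a change at every frame

smooth_score = markov_log_prior(smooth, transition)
fragmented_score = markov_log_prior(fragmented, transition)
print(smooth_score > fragmented_score)  # prints True
```

In a variational autoencoder of the kind the abstract describes, a term of this form would act as a regularizer on the classifier's label posterior. How the paper actually parameterizes and applies the prior is not stated on this page.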

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
