Maximum A Posteriori Estimator For Convolutive Sound Source Separation With Sub-Source Based NTF Model And The Localization Probabilistic Prior On The Mixing Matrix
Mieszko Fraś, Konrad Kowalczyk
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:10:14
In this paper, we present a method for separating sound source signals recorded with multiple microphones in a reverberant room. In particular, we propose a maximum a posteriori (MAP) estimator based on the multichannel nonnegative tensor factorization (NTF) model with a localization prior distribution on the mixing matrix, in which the latent data consist of so-called sub-sources for improved performance in reverberant environments. For the proposed MAP estimator, we derive a sub-source based expectation-maximization (EM) algorithm with multiplicative update (MU) rules and the localization prior (LP) on the mixing matrix (SSEM-MU-LP). We then perform several experiments with speech and instrumental sound sources recorded using two microphones, in determined and under-determined scenarios, and with different types of initialization of the model parameters. The results indicate that the proposed algorithm with the localization prior improves on state-of-the-art NTF-based source separation algorithms, with gains of up to $50\%$ in the signal-to-distortion ratio.
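For readers unfamiliar with the signal-to-distortion ratio (SDR) quoted above, the following is a minimal sketch of a simplified SDR in decibels, together with a relative-improvement helper. It is not the evaluation code used in the paper: the function names `sdr_db` and `relative_improvement` are hypothetical, the formula is the basic energy-ratio definition rather than a full BSS Eval decomposition, and it assumes the reported percentage is measured relative to a baseline SDR in dB, which the abstract does not specify.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over estimation-error energy.

    Assumption: this is the plain energy-ratio form, not the BSS Eval SDR.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    if signal_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def relative_improvement(baseline_db, proposed_db):
    """Relative SDR gain of a proposed method over a baseline (hypothetical helper)."""
    if baseline_db == 0.0:
        raise ValueError("baseline SDR of 0 dB gives an undefined relative gain")
    return (proposed_db - baseline_db) / abs(baseline_db)
```

Under these assumptions, `relative_improvement(4.0, 6.0)` evaluates to `0.5`, which is how a gain of 50% would read if a baseline scored 4 dB and the proposed method scored 6 dB; these numbers are illustrative only and are not taken from the paper.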
Chairs:
Jonathan Le Roux