  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:10:57
21 Sep 2021

Unsupervised learning of disentangled representations is a core task for discovering interpretable factors of variation in an image dataset. We propose a novel method that learns disentangled representations with semantic explanations on natural image datasets. Our method guides the representation learning of a variational autoencoder (VAE) via reconstruction in a visual-semantic embedding (VSE) space, which lets the model leverage the semantic information in image data and explain the learned latent representations in an unsupervised manner. We introduce a semantic sub-encoder and a linear semantic sub-decoder that learn word vectors corresponding to the latent variables, so that factors of variation can be explained in linguistic form. Each basis vector (column) of the linear semantic sub-decoder corresponds to one latent variable, and the basis vectors can be interpreted as word vectors indicating the meanings of the latent representations. With the sub-encoder and the sub-decoder, our model learns latent representations that are not only disentangled but also interpretable. Compared with other state-of-the-art unsupervised disentangled representation learning methods, we observe significant improvements in the disentanglement and the transferability of latent representations.
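
To make the role of the linear semantic sub-decoder concrete, here is a minimal sketch of how its columns could be read as word vectors. It is an illustration under stated assumptions, not the authors' implementation: the toy vocabulary, the 3-dimensional embedding space, the matrix values, the nearest-word lookup by cosine similarity, and the function names `decode`, `column`, and `explain_latents` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def decode(W, z):
    """Linear sub-decoder: map latent vector z to the embedding space as W @ z."""
    return [sum(w * z_j for w, z_j in zip(row, z)) for row in W]


def column(W, j):
    """Return the j-th column (basis vector) of W."""
    return [row[j] for row in W]


def explain_latents(W, vocab):
    """For each latent variable, pick the vocabulary word closest to its column.

    Nearest-word lookup is an assumption made for this sketch.
    """
    n_latents = len(W[0])
    return [
        max(vocab, key=lambda word: cosine(column(W, j), vocab[word]))
        for j in range(n_latents)
    ]


# Hypothetical word vectors in a 3-dimensional embedding space.
vocab = {
    "color": [1.0, 0.0, 0.0],
    "size": [0.0, 1.0, 0.0],
    "rotation": [0.0, 0.0, 1.0],
}

# Hypothetical decoder weights: 3 embedding dimensions x 2 latent variables.
W = [
    [0.9, 0.1],
    [0.1, 0.0],
    [0.0, 0.8],
]

labels = explain_latents(W, vocab)
print(labels)
```

Because the sub-decoder is linear, decoding a one-hot latent vector returns exactly the matching column, which is why each column can be inspected on its own as a candidate explanation for one latent variable.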
