  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
Lecture 09 Oct 2023

Face synthesis is a rapidly growing area of research in computer vision. Text-driven face synthesis is particularly flexible, but two challenges remain: fusing the semantics of text and images, and generating diverse faces. To address them, we propose a cross-modality adversarial learning framework that generates highly diverse face videos corresponding to given text descriptions. We encode text and images into a common latent space and align the text and image features to control the synthesis of face attributes. We design a novel auto-encoder with a face identity discriminator that enlarges the margin between different individuals, increasing the variety of generated faces while maintaining the semantic coherence of text and images. The proposed method has been successfully tested on the recently released Multimodal VoxCeleb dataset. Our code is publicly available at https://github.com/sunmeng7/TYS.git.
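The abstract does not give the loss functions, so the following is only a minimal sketch of the two ideas it names: aligning paired text and image features in a shared latent space, and penalizing identity features of different individuals that lie closer than a margin. The function names, the cosine-based distance, and the default margin of 0.5 are all assumptions chosen for illustration; they are not taken from the authors' repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def alignment_loss(text_feats, image_feats):
    """Mean (1 - cosine similarity) over matched text/image pairs.

    Hypothetical stand-in for a text-image alignment objective:
    the loss is 0 when every pair points in the same direction.
    """
    pairs = list(zip(text_feats, image_feats))
    return sum(1.0 - cosine(t, i) for t, i in pairs) / len(pairs)

def identity_margin_loss(id_feats, labels, margin=0.5):
    """Hinge penalty on pairs of different identities.

    Hypothetical stand-in for a margin-enlarging identity term:
    a pair with different labels is penalized when its cosine
    distance (1 - cosine similarity) is smaller than `margin`.
    """
    total, count = 0.0, 0
    for a in range(len(id_feats)):
        for b in range(a + 1, len(id_feats)):
            if labels[a] != labels[b]:
                distance = 1.0 - cosine(id_feats[a], id_feats[b])
                total += max(0.0, margin - distance)
                count += 1
    return total / count if count else 0.0
```

In a training setup of this kind, the two terms would typically be added, with weights, to the adversarial and reconstruction losses; how the actual framework combines them is not stated in the abstract.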
