TALKINGFLOW: TALKING FACIAL LANDMARK GENERATION WITH MULTI-SCALE NORMALIZING FLOW NETWORK

Sen Liang, Zhize Zhou, Hujun Bao, Rong Li, Juyong Zhang

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:05:36

08 May 2022

Deterministic models dominate the field of talking facial landmark generation by directly mapping speech signals to a certain lip-sync facial landmark sequence, which often suffer from regression to the mean face. In contrast, probability generative models are more beneficial to handle complex data space and generate diverse samples. In this work, we propose a flow-based probabilistic network named TalkingFlow to generate natural talking facial landmark with head movements from speech data. It is implemented by a weighted multi-scale architecture to improve model representation capability and a conditional Temporal Convolutional Network module to fuse speech data. Extensive experiments results show that it can effectively generate diverse and natural facial landmark from speech data. All code will be made publicly available online.

Tags:

talking head

facial landmark

generative model

motion synthesis

normalizing flow

IEEE Resource Center

TALKINGFLOW: TALKING FACIAL LANDMARK GENERATION WITH MULTI-SCALE NORMALIZING FLOW NETWORK

Sen Liang, Zhize Zhou, Hujun Bao, Rong Li, Juyong Zhang

Join an IEEE Society