Noise-Disentanglement Metric Learning for Robust Speaker Verification
Yao Sun (Tianjin University); Hanyi Zhang (Tianjin University); Longbiao Wang (Tianjin University); Kong Aik Lee (Institute for Infocomm Research, A*STAR); Meng Liu (Tianjin University); Jianwu Dang (Tianjin University)
SPS
Automatic speaker verification (ASV) suffers from performance degradation in noisy environments. To address this problem, we propose noise-disentanglement metric learning, which reduces speaker-irrelevant noisy components and builds a noise-invariant embedding space. Specifically, a disentanglement module, consisting of a speaker encoder and a reconstruction module, is dedicated to decoupling speech signals. The speaker encoder disentangles the speaker-related components, while the reconstruction module strengthens the model's ability to constrain noise information by reconstructing the signal. In addition, distribution optimization is introduced to supervise the spatial structure of speaker embeddings in noisy environments. Experiments on VoxCeleb1 indicate that the proposed method improves the performance of the speaker verification system in both clean and noisy conditions.
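
The abstract names three ingredients (a speaker encoder, a reconstruction module, and distribution optimization) but does not give their loss functions. The sketch below is only an illustration of how such terms could be combined into one training objective. Everything in it is an assumption rather than the authors' formulation: the mean-squared reconstruction error, the cosine distance between clean and noisy embeddings (a simple stand-in for the distribution term), the weights `w_recon` and `w_dist`, and all function names.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cosine_distance(u, v):
    """1 - cosine similarity; 0 when the vectors point the same way."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def combined_loss(speaker_loss, recon, target, emb_clean, emb_noisy,
                  w_recon=1.0, w_dist=1.0):
    """Hypothetical total objective (not taken from the paper).

    speaker_loss : scalar speaker-discrimination loss, computed elsewhere
    recon/target : reconstructed signal and its assumed reference
    emb_clean    : embedding of a clean utterance
    emb_noisy    : embedding of a noisy version of the same utterance
    """
    recon_term = mse(recon, target)                    # reconstruction module
    dist_term = cosine_distance(emb_clean, emb_noisy)  # stand-in for the distribution term
    return speaker_loss + w_recon * recon_term + w_dist * dist_term


if __name__ == "__main__":
    total = combined_loss(
        speaker_loss=0.5,
        recon=[1.0, 2.0], target=[1.0, 4.0],
        emb_clean=[1.0, 0.0], emb_noisy=[0.0, 1.0],
    )
    print(total)
```

In this toy setup, the distance term shrinks as the clean and noisy embeddings of the same utterance move closer together, which is one simple way to express noise invariance; the paper's actual distribution optimization may be defined quite differently.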