Improving learning objectives for speaker verification from the perspective of score comparison
Min Hyun Han (Seoul National University); Sung Hwan Mun (Seoul National University); Minchan Kim (Seoul National University); Myeonghun Jeong (Seoul National University); Sunghwan Ahn (Seoul National University); Nam Soo Kim (Seoul National University)
Deep speaker embedding systems are usually trained with
classification-based or end-to-end learning objectives. Popular
end-to-end approaches utilize deep metric learning, which
can be viewed as a few-shot classification objective. In this
paper, we investigate the limitations of conventional learning objectives
in speaker verification and propose a new learning
objective designed from the perspective of similarity scores.
The proposed method trains the network by comparing scores directly,
without being bound to a classification setting, which makes it better
suited to verification tasks. Experiments with various
network architectures show that the proposed loss yields improvements
on the VoxCeleb dataset.
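
To make the contrast concrete, the sketch below puts a metric-learning objective written as few-shot classification next to a generic pairwise score comparison. It is only an illustration under stated assumptions, not the loss proposed in the paper: the function names, the softplus penalty, and the one-query-one-enrollment batch layout are all hypothetical choices. The classification view compares scores only within a row (one query against all enrollment speakers), whereas the comparison view sets every target score against every non-target score, which is closer to how a single verification threshold is applied across trials.

```python
import numpy as np

def cosine_scores(queries, enrolls):
    """Cosine similarity matrix S[i, j] between query i and enrollment j."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    e = enrolls / np.linalg.norm(enrolls, axis=1, keepdims=True)
    return q @ e.T

def few_shot_classification_loss(scores):
    """Softmax cross-entropy in which query i must select speaker i."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def pairwise_score_comparison_loss(scores):
    """Hypothetical example: penalize non-target scores that approach or
    exceed target scores, across all trials rather than within one row."""
    n = scores.shape[0]
    mask = np.eye(n, dtype=bool)
    target = scores[mask]        # n same-speaker scores
    nontarget = scores[~mask]    # n * (n - 1) different-speaker scores
    diff = nontarget[None, :] - target[:, None]
    return float(np.mean(np.logaddexp(0.0, diff)))  # softplus(diff)

# Toy batch: 4 speakers, 8-dimensional embeddings (random, for illustration).
rng = np.random.default_rng(0)
enroll = rng.normal(size=(4, 8))
query = enroll + 0.1 * rng.normal(size=(4, 8))
S = cosine_scores(query, enroll)
print(few_shot_classification_loss(S), pairwise_score_comparison_loss(S))
```

Both functions take the same score matrix, so they can be swapped in a training loop; the difference is only in which pairs of scores are compared against each other.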