FastAudio: A Learnable Audio Front-End for Spoof Speech Detection
Quchen Fu, Zhongwei Teng, Jules White, Douglas Schmidt, Maria Powell
SPS
Length: 00:07:30
Spoof speech can be used to fool speaker verification systems, which determine a speaker's identity from voice characteristics. This paper compares popular learnable front-ends on the spoof speech detection task. We categorize the front-ends by defining two generic architectures, then analyze the filtering stages of both types in terms of learning constraints. We propose replacing fixed filterbanks with a learnable layer that can better adapt to anti-spoofing tasks. The proposed FastAudio front-end is then tested with two popular back-ends to measure performance on the Logical Access track of the ASVspoof 2019 dataset. FastAudio achieves a relative improvement of 29.7% over fixed front-ends and outperforms all other learnable front-ends on this task.
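To make the idea of replacing a fixed filterbank with a learnable layer more concrete, here is a minimal NumPy sketch. It is not the authors' FastAudio implementation. It assumes triangular filters parameterized by a normalized center frequency and a bandwidth, initialized on a mel-like scale and clamped to a valid range as one possible learning constraint. The function names (init_params, triangular_filterbank, front_end) and the symmetric-triangle simplification are hypothetical choices made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def init_params(n_filters=40, sample_rate=16000):
    """Mel-spaced starting values for the parameters a trainer would update."""
    nyquist = sample_rate / 2.0
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_filters + 2))
    centers = edges[1:-1] / nyquist              # normalized to [0, 1]
    widths = (edges[2:] - edges[:-2]) / nyquist  # full width of each triangle
    return centers, widths

def triangular_filterbank(centers, widths, n_bins):
    """Build an (n_filters, n_bins) matrix from the current parameter values."""
    # Assumed constraint: keep centers in band and widths strictly positive.
    centers = np.clip(centers, 0.0, 1.0)
    widths = np.clip(widths, 1e-3, 1.0)
    freqs = np.linspace(0.0, 1.0, n_bins)
    half = widths[:, None] / 2.0
    # Symmetric triangle: 1 at the center, falling linearly to 0 at +/- half.
    response = 1.0 - np.abs(freqs[None, :] - centers[:, None]) / half
    return np.maximum(response, 0.0)

def front_end(power_spec, centers, widths):
    """Apply the filterbank to an (n_bins, n_frames) power spectrogram, then log-compress."""
    fb = triangular_filterbank(centers, widths, power_spec.shape[0])
    return np.log(fb @ power_spec + 1e-6)
```

In a fixed front-end, centers and widths would stay at their initial values. In a learnable one they would be updated by gradient descent together with the back-end, which requires an automatic differentiation framework and is omitted here.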