Phase-Aware Spoof Speech Detection Based on Res2Net with Phase Network

Juntae Kim (SK Telecom); Sung Min Ban (SK Telecom)

07 Jun 2023

Spoof speech detection (SSD) is an essential countermeasure for automatic speaker verification systems. Although SSD with frequency-domain magnitude features has shown promising results, phase information can also help capture the artefacts of certain spoofing attacks. Both magnitude and phase features must therefore be considered to generalize across diverse types of spoofing attacks. In this study, we found that the difference in randomness between magnitude and phase features is large, which can hinder feature-level fusion via a backend neural network. We therefore propose a phase network that reduces this difference and makes Res2Net-based feature-level fusion feasible. To validate our SSD system for practical environments, we consider both known- and unknown-type SSD scenarios. Our SSD system delivers competitive results compared with other state-of-the-art SSD systems in all scenarios.
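
As a rough illustration of the two feature types the abstract contrasts, the sketch below extracts log-magnitude and phase spectra from a short-time Fourier transform using NumPy. It is not the authors' pipeline: the frame length, hop size, synthetic test signal, and the histogram-entropy function used as a proxy for "randomness" are all assumptions made for illustration, and the paper may measure randomness differently.

```python
import numpy as np


def stft(signal, frame_len=512, hop=128):
    """Short-time Fourier transform with a Hann window.

    Returns a complex array of shape (n_frames, frame_len // 2 + 1).
    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def magnitude_and_phase(signal):
    """Split the complex spectrogram into log-magnitude and phase features."""
    spec = stft(signal)
    log_mag = np.log(np.abs(spec) + 1e-8)  # small offset avoids log(0)
    phase = np.angle(spec)                 # radians, within [-pi, pi]
    return log_mag, phase


def histogram_entropy(values, bins=64):
    """Shannon entropy (bits) of a value histogram.

    Assumed stand-in for a randomness measure; higher means the values are
    spread more evenly over their range.
    """
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


# Synthetic one-second signal at 16 kHz: two tones plus weak noise (hypothetical input).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
signal = (
    np.sin(2 * np.pi * 220 * t)
    + 0.5 * np.sin(2 * np.pi * 440 * t)
    + 0.01 * rng.standard_normal(t.size)
)

log_mag, phase = magnitude_and_phase(signal)
print("log-magnitude entropy (bits):", histogram_entropy(log_mag))
print("phase entropy (bits):", histogram_entropy(phase))
```

Comparing the two printed values gives a quick, informal sense of how differently the two feature maps are distributed, which is the kind of mismatch the abstract says can hinder feature-level fusion.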

Tags:

Speaker recognition/identification/diarization

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
