Multi-Class Spectral Clustering With Overlaps For Speaker Diarization
Desh Raj, Zili Huang, Sanjeev Khudanpur
SPS
Length: 0:15:07
This paper describes a method for overlap-aware speaker diarization. Given an overlap detector and a speaker embedding extractor, our method performs spectral clustering of segments under overlap constraints. This is achieved by transforming the discrete clustering problem into a relaxed optimization problem, which is solved by eigen-decomposition. We then discretize the solution by alternating between singular value decomposition and a modified version of non-maximal suppression that is constrained by the output of the overlap detector. Furthermore, we detail an HMM-DNN-based overlap detector which performs frame-level classification and enforces constraints through HMM state transitions. Our method achieves a test DER of 24.0% on the mixed-headset setting of the AMI meeting corpus, which is a relative improvement of 15.2% over a strong agglomerative hierarchical clustering baseline, and compares favorably with other overlap-aware diarization methods. Further analysis on the LibriCSS data demonstrates the effectiveness of the proposed method in high-overlap conditions.
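The relax-then-discretize procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a precomputed segment affinity matrix, a boolean per-segment overlap mask standing in for the overlap detector output, and a known number of speakers. The function name, its parameters, the deterministic initialization, and the rule that an overlapped segment receives its two highest-scoring speakers are all assumptions made for illustration.

```python
import numpy as np


def overlap_aware_spectral_clustering(affinity, overlap, num_speakers, max_iter=20):
    """Return a binary (segments x speakers) assignment matrix (illustrative sketch)."""
    affinity = np.asarray(affinity, dtype=float)
    overlap_idx = np.flatnonzero(np.asarray(overlap, dtype=bool))
    n = affinity.shape[0]
    rows = np.arange(n)

    # Relaxed problem: top eigenvectors of the normalized affinity D^-1/2 A D^-1/2.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(normalized)  # eigenvalues come back in ascending order
    z = d_inv_sqrt[:, None] * eigvecs[:, -num_speakers:]
    x_tilde = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows

    # Assumption: initialize the rotation from rows of x_tilde that are
    # mutually close to orthogonal, starting from segment 0 for determinism.
    rotation = np.zeros((num_speakers, num_speakers))
    rotation[:, 0] = x_tilde[0]
    score = np.zeros(n)
    for k in range(1, num_speakers):
        score += np.abs(x_tilde @ rotation[:, k - 1])
        rotation[:, k] = x_tilde[np.argmin(score)]

    labels = np.zeros((n, num_speakers), dtype=int)
    prev_objective = -np.inf
    for _ in range(max_iter):
        # Modified non-maximal suppression: keep the best speaker for every
        # segment, and also the second best for segments flagged as overlapped.
        x_hat = x_tilde @ rotation
        order = np.argsort(-x_hat, axis=1)
        labels = np.zeros((n, num_speakers), dtype=int)
        labels[rows, order[:, 0]] = 1
        if num_speakers > 1:
            labels[overlap_idx, order[overlap_idx, 1]] = 1

        # SVD step: find the rotation that best aligns the relaxed solution
        # with the current discrete assignment.
        u, s, vt = np.linalg.svd(labels.T @ x_tilde)
        objective = s.sum()
        if abs(objective - prev_objective) < 1e-10:
            break
        prev_objective = objective
        rotation = vt.T @ u.T
    return labels


# Toy example: two groups of three segments; segment 2 is flagged as overlapped.
toy_affinity = np.full((6, 6), 0.1)
toy_affinity[:3, :3] = 0.9
toy_affinity[3:, 3:] = 0.9
np.fill_diagonal(toy_affinity, 1.0)
toy_overlap = [False, False, True, False, False, False]
toy_labels = overlap_aware_spectral_clustering(toy_affinity, toy_overlap, num_speakers=2)
print(toy_labels)
```

In this toy case the non-overlapped segments each receive one speaker and the flagged segment receives two. A real system would derive the affinity matrix from speaker embeddings and the overlap mask from the frame-level detector, and would also need to estimate the number of speakers.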