Cross-utterance ASR Rescoring with Graph-based Label Propagation
Srinath Tankasala (The University of Texas at Austin); Long Chen (Amazon); Andreas Stolcke (Amazon); Anirudh Raju (Amazon Alexa); Qianli Deng (Amazon); Chander Chandak (Amazon); Aparna Khare (Amazon); Roland Maas (Amazon Inc.); Venkatesh Ravichandran (Amazon)
We propose a novel approach for ASR N-best hypothesis rescoring with graph-based label propagation by leveraging cross-utterance acoustic similarity. In contrast to conventional neural language model (LM) based ASR rescoring/reranking models, our approach focuses on acoustic information and conducts the rescoring collaboratively among utterances, instead of individually. Experiments on the VCTK dataset demonstrate that our approach consistently improves ASR performance, as well as fairness across speaker groups with different accents. Our approach provides a low-cost solution for mitigating the majoritarian bias of ASR systems, without the need to train new domain- or accent-specific models.
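
To make the idea of collaborative rescoring concrete, the following is a minimal sketch of generic graph-based label propagation over an utterance similarity graph. It is not the authors' implementation: the abstract does not specify the graph construction, the acoustic features, or the update rule. The sketch assumes one fixed-size, nonzero acoustic embedding per utterance, cosine similarity as edge weight, whole hypothesis strings as labels, and the standard normalized propagation update F = alpha * S * F + (1 - alpha) * Y. The function name `propagate_rescore`, the parameters `alpha` and `iters`, and the toy data are all hypothetical.

```python
import numpy as np

def propagate_rescore(embeddings, nbest, alpha=0.5, iters=50):
    """Pick one hypothesis per utterance after label propagation.

    embeddings: (U, D) array-like, one nonzero acoustic embedding per utterance (assumed).
    nbest: list of U dicts mapping hypothesis text -> initial ASR score (assumed in [0, 1]).
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)

    # Edge weights: non-negative cosine similarity, no self-loops.
    W = np.clip(X @ X.T, 0.0, None)
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}; isolated nodes get zero rows.
    deg = W.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0)), 0.0)
    S = inv_sqrt[:, None] * W * inv_sqrt[None, :]

    # Label space: every distinct hypothesis string seen in any N-best list.
    labels = sorted({h for hyps in nbest for h in hyps})
    index = {h: k for k, h in enumerate(labels)}
    Y = np.zeros((len(nbest), len(labels)))
    for i, hyps in enumerate(nbest):
        for h, score in hyps.items():
            Y[i, index[h]] = score

    # Iterative propagation: neighbors' beliefs are mixed with the initial scores.
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y

    # Each utterance may only choose among its own N-best hypotheses.
    return [max(hyps, key=lambda h: F[i, index[h]]) for i, hyps in enumerate(nbest)]

# Toy data (invented): utterances 0 and 1 are acoustically close, 2 is unrelated.
toy_embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
toy_nbest = [
    {"turn on the light": 0.9, "turn on the night": 0.1},
    {"turn on the night": 0.55, "turn on the light": 0.45},
    {"play music": 1.0},
]
best = propagate_rescore(toy_embeddings, toy_nbest)
```

In this toy setup, the second utterance's top hypothesis on its own is "turn on the night", but its acoustically similar neighbor strongly favors "turn on the light", so propagation can flip the choice without any language model or retraining. Treating whole hypothesis strings as labels is a simplification chosen for brevity; a real system would likely need a softer notion of hypothesis agreement.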