Cross-utterance ASR Rescoring with Graph-based Label Propagation

Srinath Tankasala (The University of Texas at Austin); Long Chen (Amazon); Andreas Stolcke (Amazon); Anirudh Raju (Amazon Alexa); Qianli Deng (Amazon); Chander Chandak (Amazon); Aparna Khare (Amazon); Roland Maas (Amazon Inc.); Venkatesh Ravichandran (Amazon)


07 Jun 2023

We propose a novel approach for ASR N-best hypothesis rescoring with graph-based label propagation by leveraging cross-utterance acoustic similarity. In contrast to conventional neural language model (LM) based ASR rescoring/reranking models, our approach focuses on acoustic information and conducts the rescoring collaboratively among utterances, instead of individually. Experiments on the VCTK dataset demonstrate that our approach consistently improves ASR performance, as well as fairness across speaker groups with different accents. Our approach provides a low-cost solution for mitigating the majoritarian bias of ASR systems, without the need to train new domain- or accent-specific models.
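To make the idea of collaborative rescoring concrete, here is a minimal sketch of generic, textbook-style iterative label propagation on a small utterance graph. It is not the authors' implementation; the abstract does not give the graph construction, the similarity measure, or the update rule. The function name `propagate_scores`, the parameters `alpha` and `num_iters`, the assumption that all utterances share one candidate hypothesis set, and all numbers in the toy data are hypothetical and chosen only for illustration.

```python
import math


def propagate_scores(similarity, initial_scores, alpha=0.5, num_iters=50):
    """Generic label propagation: F <- alpha * S @ F + (1 - alpha) * Y.

    similarity:     n x n symmetric matrix of (assumed) acoustic similarities.
    initial_scores: n x k matrix, per-utterance scores over k shared hypotheses
                    (a simplifying assumption, not taken from the paper).
    """
    n = len(similarity)
    k = len(initial_scores[0])
    # Edge weights with self-loops removed.
    W = [[similarity[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    # Node degrees; isolated nodes get degree 1.0 to avoid division by zero.
    degree = [sum(row) or 1.0 for row in W]
    # Symmetrically normalized weights: S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    F = [row[:] for row in initial_scores]
    for _ in range(num_iters):
        # Blend neighbor scores with each utterance's own initial scores.
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * initial_scores[i][c]
              for c in range(k)]
             for i in range(n)]
    return F


def best_index(row):
    """Index of the highest-scoring hypothesis in one row."""
    return max(range(len(row)), key=row.__getitem__)


# Toy data: three acoustically similar utterances, two candidate hypotheses.
similarity = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.7],
    [0.8, 0.7, 1.0],
]
initial_scores = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.45, 0.55],  # this utterance initially prefers hypothesis 1
]

rescored = propagate_scores(similarity, initial_scores)
initial_best = [best_index(row) for row in initial_scores]
rescored_best = [best_index(row) for row in rescored]
```

In this toy setup the third utterance initially ranks hypothesis 1 first, but after propagation its top choice follows its two acoustically similar neighbors. A real system would need to handle N-best lists that differ between utterances, which this sketch deliberately sidesteps.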

Tags:

Robust speech recognition and adaptation


Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
