Large-Scale and Parameter-Efficient Language Modeling for Speech Processing
Huck Yang, Eng-Siong Chng, Andreas Stolcke
Length: 02:46:02
In this tutorial, we introduce the evolution of language models (LMs) for speech recognition, focusing on recent advances [1,2] in large-scale generative language models (1B+ parameters) and parameter-efficient learning techniques designed for cross-modal adaptation. We also introduce a new open-source benchmark, HyPoradise (Chen et al., NeurIPS 2023), which provides open-source n-best hypotheses and reproducible pre-trained language models for speech processing. With rising interest in using frozen pre-trained models for diverse downstream applications, how to design a “performance-effective” and “parameter-efficient” LLM fine-tuning framework remains an open question. We aim to provide an in-depth summary and a taxonomy of the differences among parameter-efficient learning modules [3]. The presented topic is emerging as an essential pathway for designing foundation models for the research community.
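
To make the idea of a parameter-efficient learning module concrete, below is a minimal illustrative sketch (not the tutorial's own code) of one widely used variant, LoRA-style low-rank adaptation, in PyTorch. The class name `LoRALinear` and the hyperparameters `r` and `alpha` are assumptions chosen for illustration: the pre-trained linear layer is frozen, and only a small pair of low-rank matrices is trained.

```python
# Illustrative sketch of a LoRA-style parameter-efficient module (assumed
# PyTorch implementation, not the tutorial's exact method): the pre-trained
# weights stay frozen and only low-rank factors A and B are updated.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update (alpha/r) * B @ A."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the trainable low-rank residual path.
        return self.base(x) + (x @ self.lora_A.t() @ self.lora_B.t()) * self.scaling


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable params: {trainable} / {total}")  # only the low-rank factors train
```

In this sketch only about 2% of the wrapped layer's parameters are trainable, which is the kind of trade-off between performance and parameter efficiency the tutorial's taxonomy compares across adapter-style modules.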