An Empirical Study Of Transformer-Based Neural Language Model Adaptation

Ke Li, Zhe Liu, Tianxing He, Hongzhao Huang, Fuchun Peng, Daniel Povey, Sanjeev Khudanpur

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 12:18

04 May 2020

We explore two adaptation approaches of deep Transformer based neural language models (LMs) for automatic speech recognition. The first approach is a pretrain-finetune framework, where we first pretrain a Transformer LM on a large-scale text corpus from scratch and then adapt it to relatively small target domains via finetuning. The second approach is a mixer of dynamically weighted models that are separately trained on source and target domains, aiming to improve simple linear interpolation with dynamic weighting. We compare the two approaches with three baselines - without adaptation, merging data, and simple interpolation - on Switchboard (SWBD) and Wall Street Journal (WSJ). Experiments show that the mixer model generally performs better than baselines and finetuning. Compared with no adaptation, finetuning and the mixer approach obtain up to relative 11.5% and 14.1% WER reductions on SWBD, respectively. The mixer model also outperforms linear interpolation and merging data. On WSJ, the mixer approach achieves a new state-of-the-art WER result.

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020

An Empirical Study Of Transformer-Based Neural Language Model Adaptation

Ke Li, Zhe Liu, Tianxing He, Hongzhao Huang, Fuchun Peng, Daniel Povey, Sanjeev Khudanpur

Value-Added Bundle(s) Including this Product

ICASSP 2020 Virtual Conference - Presentation Videos Product Bundle

More Like This

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle

IEEE ICASSP 2024, 1 4-19 April 2024, Seoul, Korea. Conference Presentation Videos Bundle

ICIP 2022, October 16-19, 2022, Bordeaux, France - Presentation Videos Product Bundle

Join an IEEE Society