MASSIVELY MULTILINGUAL ASR: A LIFELONG LEARNING SOLUTION
Bo Li, Ruoming Pang, Yu Zhang, Tara Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad
The development of end-to-end models has greatly accelerated research in massively multilingual automatic speech recognition (MMASR). Previous research has demonstrated the feasibility of building high-quality MMASR models. In this work, we study the impact of adding more languages and propose a lifelong learning approach to building high-quality MMASR systems. Experiments on a 66-language Voice Search task show that we can take a model built on 15 languages and continue training it to obtain a 32-language model, and similarly continue training to build a 67-language model. More importantly, models developed in this way achieve better quality than those trained from scratch: they maintain similar performance on the old languages and achieve competitive results on the new ones. This could speed up the development of universal ASR models that recognize speech from any language, any domain, and any environment by reusing previously learned knowledge.
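As a minimal illustration of the recipe the abstract describes, and not the paper's actual implementation, the sketch below warm-starts a second training stage from the checkpoint of the first rather than training from scratch. The TinyEncoder model, fake_batch data generator, train_stage loop, checkpoint filenames, and all dimensions are hypothetical toy stand-ins; the real system is a large-scale end-to-end model trained on multilingual Voice Search data.

# Sketch of continued (lifelong) training: train on an initial language
# mixture, checkpoint, then keep training on an enlarged mixture.
# Everything here is a toy stand-in, not the paper's architecture.
import torch
import torch.nn as nn

torch.manual_seed(0)

N_MEL, HIDDEN, N_TOKENS = 80, 256, 512  # assumed toy dimensions


class TinyEncoder(nn.Module):
    """Stand-in for a multilingual E2E ASR encoder with an output layer."""

    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_MEL, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, N_TOKENS)

    def forward(self, feats):
        h, _ = self.rnn(feats)
        return self.out(h)  # per-frame token logits


def fake_batch(batch=8, frames=50):
    """Synthetic log-mel features and frame-level targets."""
    feats = torch.randn(batch, frames, N_MEL)
    targets = torch.randint(0, N_TOKENS, (batch, frames))
    return feats, targets


def train_stage(model, steps, lr=1e-3):
    """One training stage on (a mixture of) languages."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        feats, targets = fake_batch()
        logits = model(feats)
        loss = loss_fn(logits.reshape(-1, N_TOKENS), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()


model = TinyEncoder()

# Stage 1: train on the initial 15-language mixture and checkpoint it.
train_stage(model, steps=20)
torch.save(model.state_dict(), "ckpt_15lang.pt")

# Stage 2: instead of training a 32-language model from scratch, warm-start
# from the 15-language checkpoint and continue training on the enlarged
# mixture (old plus new languages), reusing the knowledge learned so far.
model.load_state_dict(torch.load("ckpt_15lang.pt"))
train_stage(model, steps=20)
torch.save(model.state_dict(), "ckpt_32lang.pt")

The same pattern would repeat from ckpt_32lang.pt to reach the 67-language stage; in practice the data mixture, output vocabulary, and evaluation on the old languages (to check retention) would all need to be handled per stage.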