Pre-Training For Query Rewriting In A Spoken Language Understanding System

Zheng Chen, Xing Fan, Yuan Ling, Lambert Mathias, Chenlei Guo

04 May 2020

Query rewriting (QR) is an increasingly important technique for reducing customer friction caused by errors in a spoken language understanding pipeline, which can originate from various sources such as speech recognition errors, language understanding errors, or entity resolution errors. In this work, we first propose a neural-retrieval based approach for query rewriting. Then, inspired by the wide success of pre-trained contextual language embeddings, we propose a language-modeling (LM) based approach to pre-train query embeddings on historical user conversational data with a voice assistant, as a way to compensate for insufficient QR training data. We also propose to use the NLU hypotheses generated by the language understanding system to augment the pre-training. In experiments, we show that pre-training on the conversational data achieves strong performance on the QR task, and that using the NLU hypotheses further benefits performance. Finally, with pre-training providing rich prior information, we find that a small number of query-rewrite pairs is enough for the model to outperform a strong baseline fully trained on all QR data.
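
The abstract does not spell out the retrieval mechanics, but the generic shape of a neural-retrieval rewriter is: embed the live utterance and a pool of known-good rewrite candidates into a shared vector space, then return the nearest candidate by cosine similarity. The sketch below illustrates only that generic shape; the embed function (a hashed bag-of-words stand-in), the DIM width, and the candidate pool are placeholders invented here, not the paper's pre-trained encoder.

```python
# Minimal sketch of retrieval-based query rewriting (illustrative only).
# A real system would replace `embed` with the pre-trained query encoder.
import hashlib
import numpy as np

DIM = 256  # embedding width; an arbitrary illustrative choice


def embed(utterance: str) -> np.ndarray:
    """Placeholder encoder: hashed bag-of-words -> unit-normalized vector."""
    vec = np.zeros(DIM)
    for token in utterance.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def build_index(candidates: list[str]) -> np.ndarray:
    """Pre-compute embeddings for the rewrite candidate pool."""
    return np.stack([embed(c) for c in candidates])


def rewrite(query: str, candidates: list[str], index: np.ndarray) -> str:
    """Return the candidate closest to the query in embedding space
    (dot product equals cosine similarity on unit vectors)."""
    scores = index @ embed(query)
    return candidates[int(np.argmax(scores))]


if __name__ == "__main__":
    pool = [
        "play thriller by michael jackson",
        "turn on the kitchen lights",
        "what's the weather in seattle",
    ]
    idx = build_index(pool)
    # A hypothetical ASR-garbled utterance the retriever should repair:
    print(rewrite("play thriller by michael jason", pool, idx))
```

In this toy run, the garbled utterance shares most tokens with the first candidate, so retrieval recovers the intended request; the paper's contribution is learning an encoder (via LM pre-training on conversational data, augmented with NLU hypotheses) under which such semantic matches hold far more robustly than a lexical stand-in allows.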