TINYS2I: A SMALL-FOOTPRINT UTTERANCE CLASSIFICATION MODEL WITH CONTEXTUAL SUPPORT FOR ON-DEVICE SLU
Anastasios Alexandridis, Kanthashree Mysore Sathyendra, Grant Strimel, Pavel Kveton, Jon Webb, Athanasios Mouchtaris
On-device spoken language understanding (SLU) offers the potential for significant latency savings compared to cloud-based processing, as the audio stream does not need to be transmitted to a server. We present Tiny Signal-to-Interpretation (TinyS2I), an end-to-end on-device SLU approach focused on heavily resource-constrained devices. TinyS2I brings latency reduction without accuracy degradation by exploiting use cases in which the distribution of utterances that users speak to a device is heavy-tailed. The model is tailored to process frequent utterances on-device, with support for dynamic contextual content, while deferring all other requests to the cloud. Compared to a strong baseline, we demonstrate that TinyS2I achieves comparable accuracy while offering latency gains due to local processing.
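The device-versus-cloud deferral described above can be illustrated with a minimal sketch. The names (TinyUtteranceClassifier, route_utterance), the GRU encoder, the explicit defer class, and the confidence threshold are illustrative assumptions for exposition, not the paper's actual architecture or decision rule.

```python
import torch
import torch.nn as nn


class TinyUtteranceClassifier(nn.Module):
    """Hypothetical small-footprint classifier over a closed set of
    frequent (head) utterances, plus one explicit 'defer to cloud' class."""

    def __init__(self, feat_dim: int, hidden_dim: int, num_intents: int):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # +1 output unit reserved for the defer-to-cloud class
        self.head = nn.Linear(hidden_dim, num_intents + 1)

    def forward(self, audio_feats: torch.Tensor) -> torch.Tensor:
        # audio_feats: (batch, time, feat_dim); use the final hidden state
        _, h = self.encoder(audio_feats)
        return self.head(h.squeeze(0))  # logits over intents + defer class


def route_utterance(model: TinyUtteranceClassifier,
                    audio_feats: torch.Tensor,
                    threshold: float = 0.9) -> dict:
    """Classify locally; fall back to cloud SLU when the model predicts the
    defer class or is not confident enough (assumes a single utterance)."""
    with torch.no_grad():
        probs = torch.softmax(model(audio_feats), dim=-1)
    conf, pred = probs.max(dim=-1)
    defer_idx = probs.shape[-1] - 1
    if pred.item() == defer_idx or conf.item() < threshold:
        return {"route": "cloud"}  # ship the request to server-side SLU
    return {"route": "device",
            "intent": int(pred.item()),
            "confidence": float(conf.item())}
```

In this sketch, only head (frequent) utterances are resolved locally; anything the small model cannot handle confidently is routed to the cloud, which mirrors the division of labor the abstract describes without claiming to reproduce its exact model or thresholds.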