Exploring The Use Of Common Label Set To Improve Speech Recognition Of Low Resource Indian Languages

Vishwas M Shetty, Srinivasan Umesh

Length: 00:14:35

11 Jun 2021

In many Indian languages, written characters are organized on sound phonetic principles, and the ordering of characters is the same across many of them. However, while training conventional end-to-end (E2E) multilingual speech recognition systems, we treat characters or target subword units from different languages as separate entities. Since the visual rendering of these characters differs, we explore in this paper the benefits of representing such similar target subword units (e.g., Byte Pair Encoded (BPE) units) through a Common Label Set (CLS). The CLS can be created easily using automatic methods, since the ordering of characters is the same in many Indian languages. E2E models are trained using a transformer-based encoder-decoder architecture. During testing, given Mel-filterbank features as input, the system outputs a sequence of BPE units in the CLS representation. Depending on the language, we then map the recognized CLS units back to the language-specific grapheme representation. Results show that models trained using the CLS improve over both a monolingual baseline and a multilingual framework with separate symbols for each language. Similar experiments on a subset of the Voxforge dataset also confirm the benefits of the CLS. An extension of this idea is to decode an unseen (zero-resource) language using a CLS-trained model.
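The abstract does not say how the CLS is built, so the following is only a minimal sketch of one way an automatic mapping could work, not the authors' method. It assumes the major Indic scripts occupy parallel 128-code-point Unicode blocks, so that a character's offset within its block can serve as a shared label. The names `INDIC_BLOCK_START`, `to_cls`, and `from_cls` are hypothetical, and BPE training on the resulting label sequences is left out.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: each script below starts a 128-code-point Unicode block whose
# internal layout parallels the others, so the in-block offset is a shared label.
INDIC_BLOCK_START = {
    "hindi": 0x0900,      # Devanagari block
    "bengali": 0x0980,
    "gujarati": 0x0A80,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def to_cls(text, lang):
    """Map each in-script character to its block offset (the common label).

    Characters outside the language's block (spaces, digits, punctuation)
    are passed through unchanged as strings.
    """
    start = INDIC_BLOCK_START[lang]
    labels = []
    for ch in text:
        offset = ord(ch) - start
        labels.append(offset if 0 <= offset < BLOCK_SIZE else ch)
    return labels


def from_cls(labels, lang):
    """Map common labels back to the graphemes of the requested language."""
    start = INDIC_BLOCK_START[lang]
    return "".join(
        chr(start + item) if isinstance(item, int) else item
        for item in labels
    )


if __name__ == "__main__":
    labels = to_cls("\u0915\u092e\u0932", "hindi")   # Devanagari ka, ma, la
    print(labels)
    print(from_cls(labels, "telugu"))               # same labels rendered in Telugu
```

One limitation of this sketch is that not every offset is assigned in every script (Tamil, for example, lacks several consonants that Devanagari has), so mapping labels into a different language can land on unassigned code points. A practical system would need an explicit fallback for those cases.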

Chairs:

Zhijian Ou

Tags:

signal processing society, IEEE ICASSP 2021, virtual conference, June 6-11 2021
