ACCURATE AND RESOURCE-EFFICIENT LIPREADING WITH EFFICIENTNETV2 AND TRANSFORMERS

Alexandros Koumparoulis, Gerasimos Potamianos

Length: 00:09:25

13 May 2022

We present a novel resource-efficient end-to-end architecture for lipreading that achieves state-of-the-art results on a popular and challenging benchmark. We make three contributions. First, inspired by the recent success of the EfficientNet architecture in image classification and by our earlier work on resource-efficient lipreading models (MobiLipNet), we introduce EfficientNets to the lipreading task. Second, we show that the 3D front-end currently most popular in the literature contains a max-pool layer that prevents networks from reaching superior performance, and we propose its removal. Third, we improve the robustness of our system's back-end by including a Transformer encoder. We evaluate the proposed system on the "Lipreading In-The-Wild" (LRW) corpus, a database of short video segments from BBC TV broadcasts. The proposed network (T-variant) attains 88.53% word accuracy, a 0.17% absolute improvement over the current state-of-the-art, while being five times less computationally intensive. Further, an up-scaled version of our model (L-variant) achieves 89.52%, a new state-of-the-art result on the LRW corpus.
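To illustrate why the max-pool layer in the 3D front-end matters, the following minimal sketch computes the feature-map size that reaches the 2D backbone with and without it. The abstract does not give layer hyperparameters or input sizes, so the values used here are assumptions for illustration only: a 5x7x7 convolution with stride 1x2x2, a 1x3x3 max-pool with stride 1x2x2, and a clip of 29 frames cropped to 88x88 pixels. The function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length along one axis for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def front_end_shape(frames, height, width, use_max_pool):
    """Return (frames, height, width) after an assumed 3D front-end.

    Assumed hyperparameters (not stated in the abstract):
    conv kernel 5x7x7, stride 1x2x2, padding 2x3x3;
    max-pool kernel 1x3x3, stride 1x2x2, padding 0x1x1.
    """
    t = conv_out(frames, 5, 1, 2)
    h = conv_out(height, 7, 2, 3)
    w = conv_out(width, 7, 2, 3)
    if use_max_pool:
        # Pooling is spatial only: the temporal axis keeps its length.
        t = conv_out(t, 1, 1, 0)
        h = conv_out(h, 3, 2, 1)
        w = conv_out(w, 3, 2, 1)
    return t, h, w


# Assumed input: 29 frames of 88x88 mouth-region crops.
with_pool = front_end_shape(29, 88, 88, use_max_pool=True)
without_pool = front_end_shape(29, 88, 88, use_max_pool=False)
print(with_pool)     # (29, 22, 22)
print(without_pool)  # (29, 44, 44)
```

Under these assumptions, dropping the max-pool doubles the spatial resolution per axis (four times as many positions per frame) passed to the backbone. The abstract reports that removing this layer helps accuracy, but does not state the mechanism, so this sketch shows only the change in resolution.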

Tags:

computational efficiency

visual speech recognition

lipreading

transformers

efficientnet

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
