Multi-Talker Meeting Transcription

Reinhold Haeb-Umbach

SPS

Length: 00:57:07

Invited Speech 18 Dec 2023

Automatic meeting transcription is concerned with transcribing conversations, enriched with information about who spoke when. This is a challenging task, because the speech signal captured by distant microphones is noisy and reverberant and, depending on the nature of the meeting, can contain a high degree of overlapped speech, where more than one speaker is active at a time. The interaction dynamics, where speakers talk intermittently, also pose problems for conventional enhancement and recognition systems.

Multi-talker meeting transcription thus calls for solving several tasks: source separation, diarization, and speech recognition. We will discuss approaches that address those tasks either separately or jointly, where the latter can lead to highly effective solutions. We will also touch upon "ad-hoc" configurations, where several initially unsynchronized microphones at unknown positions are used for signal capture. Finally, we will spend a few words on word error rate performance evaluation, which is less straightforward than one might think.
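To illustrate why word error rate evaluation becomes less straightforward with several talkers, the following is a minimal sketch, not taken from the talk. It assumes the system outputs one concatenated transcript per speaker but with arbitrary speaker labels, so the scorer has to try every assignment of hypothesis speakers to reference speakers and keep the best one. This is one commonly used idea (a minimum-permutation WER over concatenated transcripts); the function names `edit_distance`, `wer`, and `min_permutation_wer` are hypothetical, and the sketch assumes the same number of speakers in the reference and in the hypothesis.

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (cost 1) or match (cost 0)
            ))
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Single-speaker WER: word errors divided by reference length."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def min_permutation_wer(refs, hyps):
    """WER over per-speaker transcripts, minimized over speaker assignments.

    Assumption: refs and hyps hold one concatenated transcript per speaker
    and have the same length.
    """
    total_ref_words = sum(len(r.split()) for r in refs)
    best_errors = min(
        sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_words


refs = ["hello how are you", "fine thanks"]
hyps = ["fine thanks", "hello how are you"]  # correct words, swapped speaker labels
print(min_permutation_wer(refs, hyps))  # 0.0
```

Scoring the same hypotheses in their given order would count every word as an error, although the recognized words are all correct. The exhaustive search over permutations grows factorially with the number of speakers, so this sketch is only practical for small meetings.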

Tags:

IEEE ASRU 2023

automatic speech recognition

Multi talker meeting transcription
