SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:57:07
Automatic meeting transcription aims to produce a transcript of a conversation, enriched with information about who spoke when. This is a challenging task: speech captured by distant microphones is noisy and reverberant and, depending on the nature of the meeting, can contain a high degree of overlapped speech, where more than one speaker is active at a time. The interaction dynamics, with speakers talking intermittently, also pose problems for conventional enhancement and recognition systems.

Multi-talker meeting transcription thus calls for solving several tasks: source separation, diarization, and speech recognition. We will discuss approaches that address these tasks either separately or jointly, where the latter can lead to highly effective solutions. We will also touch upon "ad-hoc" configurations, where several initially unsynchronized microphones at unknown positions are used for signal capture. Finally, we will say a few words about word error rate performance evaluation, which is less straightforward than one might think.