Speaker Diarization and Automatic Analysis Methods of Audio from Individuals with Autism Spectrum Disorder

Dorsey Beckles, Luca Decicco, Chandler Mason, Zhaozhou Tang, Desmond Caulley, David V Anderson

Length: 00:10:27

27 Apr 2021

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that affects an individual's communication skills and social interactions. Currently, the only way to determine whether a child is autistic is to send them to several different doctors and specialists. This can be expensive and time consuming, and doctors very often reach different conclusions about whether the child has autism. Previous research has shown that analyzing a full day's audio recording is sufficient for autism diagnosis. The objective of this research is to develop a tool that automatically analyzes audio recordings and reaches a conclusion about ASD. The first step in this research is speaker diarization. The goal of diarization is to partition the audio to determine when the child is talking and when the parents are talking. So far, we have gathered data from VoxCeleb to create a large-scale dataset of human speech. Extracting speech information from an audio sample required computing Mel-Frequency Cepstral Coefficients (MFCCs). The data was then organized using Kaldi, a speech processing toolkit that supports speaker diarization. This semester, we plan to create an ASD transfer learning model trained using the VoxCeleb dataset. This should bring us closer to an algorithm for detecting ASD. Additionally, we will identify question segments and determine the child's rate of response to questions from parents.
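To make the MFCC step mentioned in the abstract more concrete, the following is a minimal sketch in plain Python of how such coefficients are commonly computed: frame the signal, apply a window, take the power spectrum, pool it with mel-spaced triangular filters, take logs, and apply a DCT. It is not the authors' pipeline. All function names and the default frame length, hop size, filter count, and coefficient count are illustrative assumptions, and a practical system would rely on an optimized toolkit such as Kaldi rather than this naive DFT.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):        # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log filterbank energies, floored to avoid log(0).
        log_energy = [math.log(max(sum(w * p for w, p in zip(row, spec)),
                                   1e-10))
                      for row in bank]
        # DCT-II decorrelates the log energies into cepstral coefficients.
        coeffs = [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                      for m, e in enumerate(log_energy))
                  for c in range(n_coeffs)]
        features.append(coeffs)
    return features


# Synthetic 440 Hz tone standing in for a real recording (assumption).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
features = mfcc(tone, sample_rate)
print(len(features), "frames,", len(features[0]), "coefficients each")
```

Each frame yields a short feature vector, and sequences of such vectors are the kind of input that diarization and speaker models typically consume.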
