Speaker Diarization and Automatic Analysis Methods of Audio from Individuals with Autism Spectrum Disorder
Dorsey Beckles, Luca Decicco, Chandler Mason, Zhaozhou Tang, Desmond Caulley, David V Anderson
RFID
Length: 00:10:27
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that affects an individual's communication skills and social interactions. Currently, the only way to determine whether a child is autistic is to send them to several different doctors and specialists. This can be expensive and time-consuming, and doctors very often reach different conclusions about whether the child has autism. Previous research has shown that analyzing a day's audio recording is sufficient for autism diagnosis. The objective of this research is to develop a tool that automatically analyzes audio recordings and reaches a conclusion about ASD. The first step in this research is speaker diarization. The goal of diarization is to partition the audio to determine when the child is talking versus when the parents are talking. So far, we have gathered data from VoxCeleb to create a large-scale dataset of human speech. Extracting speech features from an audio sample required the use of Mel-Frequency Cepstral Coefficients (MFCCs). The data was then organized using Kaldi, a speech-processing toolkit used here for speaker diarization. This semester, we plan to create a transfer learning model for autism spectrum disorder, trained using the VoxCeleb dataset. With that, we expect to come closer to an algorithm for detecting ASD. Additionally, we will identify question segments and determine the child's rate of response to questions from parents.
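As an illustration of the last step, a child's response rate to parent questions could be computed from diarized segments roughly as follows. This is a minimal sketch, not the authors' method: the `Segment` structure, the "parent"/"child" labels, the `is_question` flag (assumed to come from a separate question detector), and the 3-second `max_gap` window are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One diarized stretch of audio (hypothetical structure)."""
    speaker: str                 # "parent" or "child" (assumed labels)
    start: float                 # start time in seconds
    end: float                   # end time in seconds
    is_question: bool = False    # assumed output of a separate question detector


def response_rate(segments: List[Segment], max_gap: float = 3.0) -> float:
    """Fraction of parent questions followed by child speech within max_gap seconds."""
    ordered = sorted(segments, key=lambda s: s.start)
    questions = [s for s in ordered if s.speaker == "parent" and s.is_question]
    if not questions:
        return 0.0
    answered = 0
    for q in questions:
        # A question counts as answered if any child segment starts at or
        # after the question's end and no later than max_gap seconds after it.
        if any(
            s.speaker == "child" and q.end <= s.start <= q.end + max_gap
            for s in ordered
        ):
            answered += 1
    return answered / len(questions)


# Made-up segments for illustration only, not real data.
segments = [
    Segment("parent", 0.0, 2.0, is_question=True),
    Segment("child", 2.5, 4.0),
    Segment("parent", 10.0, 12.0, is_question=True),
    Segment("child", 20.0, 21.0),
    Segment("parent", 30.0, 31.0),
]
rate = response_rate(segments)
print(rate)
```

The fixed time window is the simplest possible definition of a response. A real system would likely need to handle overlapping speech and decide whether the child's speech is actually directed at the question.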