
Language Independent Gender Identification From Raw Waveform Using Multi-Scale Convolutional Neural Networks

Krishna D N, Amrutha D, Sai Sumith Reddy, Anudeepa Acharya, Triveni B J, Prabhu Aashish Garapati

Length: 12:50
04 May 2020

In this work, we propose a raw-waveform-based multi-scale convolutional neural network (CNN) approach for language-independent gender identification. Instead of handcrafted features, our approach feeds the raw audio waveform directly into a 1-dimensional multi-scale CNN for speaker gender classification. The multi-scale CNN has the advantage of applying filters of different sizes to the audio waveform, which lets it extract features from the raw waveform at multiple scales. The network has three streams, each containing multiple residual blocks, and the features from all streams are combined after the last convolution layer to predict the gender label. Our gender identification dataset contains 176 hours of audio data from 6 Indian languages (Hindi, English, Kannada, Telugu, Tamil, and Gujarati). Our experiments show that learning the gender identification task from the raw waveform gives better performance and faster training, and that the multi-scale CNN on the raw waveform outperforms the spectrogram-based model by an absolute improvement of 2.24%.
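The three-stream idea described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the kernel sizes (3, 9, 27), the number of residual blocks per stream, the single-channel filters, the average-plus-max pooling, and the logistic output layer are all assumptions made for illustration, and the weights are random rather than trained. A practical model would use a deep learning framework with many channels per layer.

```python
import math
import random


def conv1d_same(x, kernel):
    """Cross-correlate x with kernel, zero-padded so the output length equals len(x)."""
    pad_left = (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - pad_left
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, k1, k2):
    """Compute relu(x + conv(relu(conv(x)))), i.e. a skip connection around two convolutions."""
    h = relu(conv1d_same(x, k1))
    h = conv1d_same(h, k2)
    return relu([a + b for a, b in zip(x, h)])


def make_stream(rng, kernel_size, n_blocks):
    """Random (untrained) filters for one stream; every block uses the same filter size."""
    scale = 1.0 / kernel_size
    def rand_kernel():
        return [rng.uniform(-scale, scale) for _ in range(kernel_size)]
    return [(rand_kernel(), rand_kernel()) for _ in range(n_blocks)]


def stream_features(x, blocks):
    """Run one stream and summarize its output with global average and max pooling (assumed)."""
    h = x
    for k1, k2 in blocks:
        h = residual_block(h, k1, k2)
    return [sum(h) / len(h), max(h)]


def predict_proba(x, streams, weights, bias):
    """Concatenate the features of all streams, then apply a logistic output unit."""
    feats = []
    for blocks in streams:
        feats.extend(stream_features(x, blocks))
    logit = bias + sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-logit)), feats


rng = random.Random(0)
KERNEL_SIZES = (3, 9, 27)  # hypothetical filter sizes, one per stream
streams = [make_stream(rng, k, n_blocks=2) for k in KERNEL_SIZES]
weights = [rng.uniform(-1.0, 1.0) for _ in range(2 * len(KERNEL_SIZES))]

# Synthetic 150 Hz tone sampled at 8 kHz, standing in for a raw speech waveform.
waveform = [math.sin(2 * math.pi * 150 * t / 8000) for t in range(400)]
prob, feats = predict_proba(waveform, streams, weights, bias=0.0)
print(len(feats), prob)
```

Each stream sees the same waveform through a different filter width, so short filters respond to fine detail and long filters to slower structure; concatenating the pooled outputs gives the classifier all scales at once.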
