USING MODIFIED ADULT SPEECH AS DATA AUGMENTATION FOR CHILD SPEECH RECOGNITION

Zijian Fan (Norwegian University of Science and Technology); Xinwei Cao (NTNU); Giampiero Salvi (NTNU); Torbjørn Svendsen (NTNU)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
07 Jun 2023

Data augmentation enhances the size and quality of training data so that machine learning models, including deep learning models, can achieve better performance. This paper proposes a novel way of applying data augmentation to child speech recognition in low-resource scenarios. The augmented data is produced by modifying existing adult speech signals in two main steps: resampling and time scaling. The experiments use speech from children ranging from kindergarten to grade 10, together with adult speech. We test the proposed method with both a TDNN-HMM and a GMM-HMM acoustic model. The results show that the proposed scheme achieves a 7.95% relative reduction in word error rate (WER), compared with a 4.56% relative reduction for a traditional bilinear frequency warping approach.
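
The two steps named in the abstract, resampling and time scaling, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives neither the resampling factor nor the time-scaling algorithm, so the factor of 1.2, the linear-interpolation resampler, the phase vocoder, and all function names below are assumptions chosen for illustration. The sketch follows one plausible reading of the method: resampling raises pitch and formants while shortening the signal, and time scaling then adjusts the duration without moving the spectrum back.

```python
import numpy as np

def resample_linear(x, factor):
    """Resample x to about len(x)/factor samples by linear interpolation.

    Played back at the original rate, factor > 1 scales all frequencies up
    by factor and shortens the duration by the same factor. Linear
    interpolation is a simplification (no anti-aliasing filter).
    """
    n_out = int(round(len(x) / factor))
    t_out = np.linspace(0, len(x) - 1, n_out)
    return np.interp(t_out, np.arange(len(x)), x)

def time_stretch_pv(x, rate, n_fft=1024):
    """Phase-vocoder time scaling: rate < 1 lengthens, rate > 1 shortens."""
    hop = n_fft // 4
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    # Short-time Fourier transform of the input, one row per frame.
    stft = np.array([np.fft.rfft(window * x[i * hop:i * hop + n_fft])
                     for i in range(n_frames)])
    steps = np.arange(0, n_frames - 1, rate)   # fractional analysis positions
    omega = 2 * np.pi * hop * np.arange(n_fft // 2 + 1) / n_fft
    phase = np.angle(stft[0])
    out = np.zeros(len(steps) * hop + n_fft)
    for k, s in enumerate(steps):
        i = int(s)
        frac = s - i
        # Interpolate magnitudes between the two neighbouring frames.
        mag = (1 - frac) * np.abs(stft[i]) + frac * np.abs(stft[i + 1])
        frame = np.fft.irfft(mag * np.exp(1j * phase), n_fft)
        out[k * hop:k * hop + n_fft] += window * frame
        # Advance the phase by the measured per-hop phase increment.
        dphi = np.angle(stft[i + 1]) - np.angle(stft[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase = phase + omega + dphi
    # Squared Hann windows at 75% overlap sum to about 1.5.
    return out / 1.5

def dominant_freq(x, sr):
    """Frequency in Hz of the largest FFT magnitude peak."""
    return np.argmax(np.abs(np.fft.rfft(x))) * sr / len(x)

# Toy stand-in for adult speech: a 200 Hz tone, 1 second at 16 kHz.
sr = 16000
tone = np.sin(2 * np.pi * 200 * np.arange(sr) / sr)
factor = 1.2                                    # assumed value, not from the paper
shifted = resample_linear(tone, factor)         # higher and shorter
restored = time_stretch_pv(shifted, 1 / factor) # duration stretched back
```

On this toy signal, `shifted` is shorter than `tone` and its spectral peak moves up by the resampling factor, while `restored` returns to roughly the original length and keeps the raised peak. A real pipeline would use a band-limited resampler and speech recordings rather than a pure tone.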
