    Length: 15:40
04 May 2020

The spectral information of acoustic scenes is diverse and complex, which poses challenges for acoustic scene tasks. To improve classification performance, a variety of convolutional neural networks (CNNs) have been proposed to extract richer semantic information from scene utterances. However, different regions of the features extracted by a CNN-based encoder differ in importance. In this paper, we propose a novel strategy for acoustic scene classification, namely a high-resolution attention network with acoustic segment model (HRAN-ASM). In this approach, we use a fully convolutional neural network to obtain high-level semantic information and then adopt a two-stage attention strategy to select the relevant acoustic scene segments. In addition, the acoustic segment model (ASM) proposed in our recent work provides embedding vectors for this attention mechanism. The approach is evaluated on DCASE 2018 Task 1a, where it achieves a classification accuracy of 70.5% with a single system and no data expansion, which is superior to a CNN-based self-attention mechanism and highly competitive.
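
As a rough illustration of how attention can weight regions of encoder features by importance, the following is a minimal sketch of a single generic attention-pooling step. It is not the HRAN-ASM architecture: the abstract does not specify the two-stage design, so the function `attention_pool`, the scaled dot-product scoring, the array shapes, and the random `query` vector (standing in for an embedding such as one an ASM might provide) are all assumptions made for illustration.

```python
import numpy as np

def attention_pool(features, query):
    """Pool segment features into one vector using softmax attention weights.

    features: (num_segments, dim) array, e.g. flattened regions of a CNN feature map.
    query:    (dim,) array, a hypothetical embedding used to score each segment.
    """
    # Scaled dot-product score for each segment (an assumed scoring function).
    scores = features @ query / np.sqrt(features.shape[1])
    # Subtract the maximum before exponentiating for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of segment features; higher-scoring segments contribute more.
    pooled = weights @ features
    return pooled, weights

# Toy input: 6 segments with 8-dimensional features (random placeholders, not real data).
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 8))
query = rng.normal(size=8)
pooled, weights = attention_pool(features, query)
```

Here `weights` sums to one across segments, so segments that score higher against the query dominate the pooled representation that a classifier would then consume.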
