HIERARCHICAL CLASSIFICATION OF SINGING ACTIVITY, GENDER, AND TYPE IN COMPLEX MUSIC RECORDINGS
Michael Krause, Meinard Müller
SPS
Length: 00:05:36
Traditionally, work on singing voice detection has focused on identifying singing activity in music recordings. In this work, our aim is to extend this task towards simultaneously detecting the presence of singing voice as well as determining singer gender and voice type. We describe and compare four strategies for exploiting the hierarchical relationships between these levels. In particular, we introduce a novel loss term that promotes consistency across hierarchy levels. We evaluate the strategies on a dataset containing over 200 hours of complex opera recordings with various singers of different genders and voice types, with a particular focus on hierarchical consistency. Our experiments show that by adding our loss term, a joint classification strategy using a single neural network achieves slightly improved evaluation scores and significantly more consistent results.
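The abstract does not state the form of the consistency loss, so the following is only a minimal sketch of one plausible idea, not the authors' formulation. It assumes per-frame probabilities for a parent level (for example, singing activity) and for its child classes (for example, gender), and penalizes any child probability that exceeds its parent's. The function names `consistency_penalty` and `is_consistent`, the hinge form, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def consistency_penalty(p_parent, p_children):
    """Mean hinge penalty for child probabilities that exceed the parent's.

    p_parent:   shape (frames,), probability of the parent class per frame.
    p_children: shape (frames, n_children), probabilities of the child classes.
    Hypothetical formulation; the source does not specify the loss.
    """
    p_parent = np.asarray(p_parent, dtype=float)
    p_children = np.asarray(p_children, dtype=float)
    # Positive only where a child is predicted as more likely than its parent.
    violation = np.maximum(0.0, p_children - p_parent[:, None])
    return float(violation.mean())


def is_consistent(p_parent, p_children, threshold=0.5):
    """Per-frame check: no child may be active while the parent is inactive."""
    parent_active = np.asarray(p_parent, dtype=float) >= threshold
    child_active = np.asarray(p_children, dtype=float) >= threshold
    return ~(child_active.any(axis=1) & ~parent_active)


# Two frames: the second predicts a gender although singing is unlikely.
singing = [0.9, 0.2]
gender = [[0.8, 0.1], [0.6, 0.1]]
penalty = consistency_penalty(singing, gender)
consistent = is_consistent(singing, gender)
```

In a training setup, a term of this kind would typically be added to the per-level classification losses with a weighting factor, and the same check could be applied between the gender and voice-type levels.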