AUTOMATIC DEPRESSION LEVEL ASSESSMENT FROM SPEECH BY LONG-TERM GLOBAL INFORMATION EMBEDDING
Ya Li, Mingyue Niu, Ziping Zhao, Jianhua Tao
SPS
Length: 00:08:59
Depression is a serious mood disorder that adversely affects people's social activities. Growing attention has therefore been paid to automatic depression assessment, especially from speech. However, most previous work uses hand-crafted features, or deep features obtained from neural network-based extractors, and feeds them into a classifier or regressor, an approach that ignores the temporal relations among these features. To address this issue, this paper proposes a global information embedding (GIE) that exploits the long-term global information of depression to re-weight the LSTM output sequence. The short-term features are then pooled into long-term features by LASSO optimization to further improve the accuracy of depression recognition. Experiments on AVEC 2013 and AVEC 2014 validate the proposed method, which achieves RMSEs of 9.63 and 9.40, respectively.
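The abstract does not give the GIE formulation, so the following is only a minimal sketch of the general idea of re-weighting a sequence of recurrent outputs with a global summary, together with the RMSE metric used for reporting. Everything here is an assumption made for illustration: the function names `global_reweight` and `rmse` are hypothetical, the global vector is taken as the mean over time, and the weights come from a softmax over scaled dot products. The paper's actual embedding and its LASSO-based pooling may differ and are not reproduced here.

```python
import math


def global_reweight(seq):
    """Re-weight a sequence of frame-level vectors by a global summary.

    `seq` is a list of T vectors (lists of floats) standing in for LSTM
    outputs. ASSUMPTION: the global summary is the mean over time and the
    weights are a softmax over scaled dot products; this is an illustrative
    stand-in, not the GIE defined in the paper.
    """
    steps = len(seq)
    dim = len(seq[0])

    # Global summary vector: average of all time steps.
    g = [sum(h[d] for h in seq) / steps for d in range(dim)]

    # Similarity of each time step to the global summary.
    scores = [
        sum(hd * gd for hd, gd in zip(h, g)) / math.sqrt(dim) for h in seq
    ]

    # Numerically stable softmax over time.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Scale each time step by its weight.
    reweighted = [[w * hd for hd in h] for w, h in zip(weights, seq)]
    return weights, reweighted


def rmse(pred, target):
    """Root mean squared error between two equal-length score lists."""
    squared = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Toy sequence: two similar frames and one outlier frame.
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    weights, _ = global_reweight(frames)
    print(weights)
    print(rmse([10.0, 20.0], [12.0, 17.0]))
```

In this toy sequence the two frames that agree with the global summary receive larger weights than the outlier frame, which is the kind of effect a global re-weighting step is meant to produce. A real system would learn the embedding jointly with the LSTM rather than using a fixed mean.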