DNN-Based Mask Estimation Integrating Spectral and Spatial Features for Robust Beamforming
Chengyun Deng, Hui Song, Yongtao Sha, Yi Zhang, Xiangang Li
SPS
Length: 12:40
Spectral mask-based beamforming has shown competitive performance in multi-channel speech enhancement in recent years. However, such methods estimate a mask on each channel independently and then combine the masks from all channels into a single mask for speech and noise covariance estimation. Mask estimation that uses both spectral and spatial information has not yet been well explored. In this paper, we propose a novel spectral-spatial mask-based beamforming method for two-channel noisy signals, in which spectral amplitude and cross-channel spatial features are integrated to improve mask estimation. The per-channel masks are not merged, so that channel characteristics are preserved for robust beamforming. Furthermore, the two-channel method is extended to a six-channel scenario. Experiments on the CHiME-3 evaluation confirm that the proposed method outperforms two spectral mask estimation approaches in terms of word error rate (WER).
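The role of the masks in covariance estimation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the unmerged per-channel masks enter the covariance estimate or which beamformer is used. The sketch assumes each channel's STFT is weighted by its own mask before the outer product is taken, and it assumes a standard MVDR beamformer in its reference-channel form. The function names (`masked_covariance`, `mvdr_weights`, `beamform`) and the diagonal-loading constant `eps` are hypothetical.

```python
import numpy as np

def masked_covariance(Y, mask):
    """Spatial covariance per frequency from per-channel masks.

    Y:    complex STFT, shape (C, F, T)
    mask: real mask in [0, 1], shape (C, F, T), one mask per channel
          (assumption: each channel is weighted by its own mask)
    Returns an array of shape (F, C, C).
    """
    X = mask * Y
    # phi[f, c, d] = sum_t X[c, f, t] * conj(X[d, f, t]), averaged over frames
    return np.einsum('cft,dft->fcd', X, X.conj()) / Y.shape[-1]

def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-6):
    """MVDR weights (assumed beamformer): w = (Phi_n^-1 Phi_s) u / tr(Phi_n^-1 Phi_s)."""
    C = phi_n.shape[-1]
    # Diagonal loading scaled by the average noise power keeps Phi_n invertible
    power = np.trace(phi_n, axis1=-2, axis2=-1).real / C
    phi_n = phi_n + eps * power[:, None, None] * np.eye(C)
    num = np.linalg.solve(phi_n, phi_s)          # (F, C, C)
    tr = np.trace(num, axis1=-2, axis2=-1)       # (F,)
    return num[..., ref] / tr[:, None]           # (F, C)

def beamform(w, Y):
    """Apply w^H y per time-frequency bin. Returns shape (F, T)."""
    return np.einsum('fc,cft->ft', w.conj(), Y)
```

In use, a mask estimator would supply a speech mask and a noise mask for every channel; `masked_covariance` is called once with each, and the resulting weights are applied to the noisy STFT with `beamform`.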