Guidance And Teaching Network For Video Salient Object Detection

Yingxia Jiao, Xiao Wang, Yu-Cheng Chou, Shouyuan Yang, Ge-Peng Ji, Rong Zhu, Ge Gao

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:07:27
21 Sep 2021

Owing to the difficulties of mining spatial-temporal cues, the existing approaches for video salient object detection (VSOD) are limited in understanding complex and noisy scenarios, and often fail in inferring prominent objects. To alleviate such shortcomings, we propose a simple yet efficient architecture, termed Guidance and Teaching Network (GTNet), to independently distil effective spatial and temporal cues with implicit guidance and explicit teaching at feature- and decision-level, respectively. To be specific, we (a) introduce a temporal modulator to implicitly bridge features from the motion branch into the appearance branch, which is capable of fusing cross-modal features collaboratively, and (b) utilise a motion-guided mask to propagate the explicit cues during the feature aggregation. This novel learning strategy achieves satisfactory results via decoupling the complex spatial-temporal cues and mapping informative cues across different modalities. Extensive experiments on three challenging benchmarks show that the proposed method can run at ~28 fps on a single TITAN Xp GPU and perform competitively against 14 cutting-edge baselines.
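
The two mechanisms named in the abstract, implicit guidance at feature level and explicit teaching with a motion-guided mask, can be illustrated with a toy sketch. This is not the GTNet implementation: the abstract does not give the formulas, so the sigmoid gate with a residual term, the element-wise mask, and the function names `modulate` and `apply_motion_mask` are all assumptions made for illustration, applied to tiny single-channel feature maps stored as nested lists.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def modulate(appearance, motion):
    """Implicit guidance (assumed form, not taken from the paper).

    Each appearance feature is scaled by 1 + sigmoid(motion feature) at the
    same position, so strong motion responses amplify the appearance
    feature while the residual term keeps the original signal.
    """
    return [
        [a * (1.0 + sigmoid(m)) for a, m in zip(a_row, m_row)]
        for a_row, m_row in zip(appearance, motion)
    ]


def apply_motion_mask(features, motion_mask):
    """Explicit teaching (assumed form, not taken from the paper).

    Features are multiplied element-wise by a motion-derived mask whose
    values lie in [0, 1]; a mask value of 0 suppresses that position.
    """
    return [
        [f * w for f, w in zip(f_row, w_row)]
        for f_row, w_row in zip(features, motion_mask)
    ]


# Hypothetical 2x2 single-channel feature maps.
appearance = [[1.0, 2.0], [3.0, 4.0]]
motion = [[0.0, 0.0], [10.0, -10.0]]
mask = [[1.0, 0.0], [1.0, 0.5]]

fused = modulate(appearance, motion)
masked = apply_motion_mask(fused, mask)
```

In this toy setup, a motion response of zero scales the appearance feature by 1.5, a strongly positive response scales it by almost 2, and a mask value of zero removes the position from the aggregated output regardless of its fused value.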
