  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:15:12
07 Oct 2022

In visual tracking, the template patch on the image is usually several times larger than the object's bounding box, so background information around the object is encoded into some of the template features. These background features are nevertheless matched against the search features, which interferes with the tracker's ability to separate the object from the background accurately. In this work, we present a novel Transformer-based feature fusion network for visual tracking. Specifically, to reduce the interference of background information in the template patch, we extract the template features corresponding to the foreground region of the image, called TFFR, and fuse them with the search features through an attention mechanism. On that basis, we design a concise Transformer visual tracker built on TFFR, called TVT-TFFR. Extensive experiments show that TVT-TFFR achieves state-of-the-art performance on several prevalent tracking benchmarks and runs at 38 FPS, meeting the real-time requirement.
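
The idea of keeping only foreground template features and fusing them with search features by attention can be illustrated with a small sketch. This is not the TVT-TFFR implementation, whose details the abstract does not give. The function names, the 4x4 feature grid, the rule that a feature cell counts as foreground when its center lies inside the bounding box, and the single-head attention without learned projections are all assumptions made for illustration.

```python
import math


def foreground_token_indices(grid_size, patch_size, bbox):
    """Flat indices of template feature cells whose centers lie inside bbox.

    Assumption: a square feature grid over a square template patch, with
    bbox = (x0, y0, x1, y1) given in template-patch pixel coordinates.
    """
    stride = patch_size / grid_size
    x0, y0, x1, y1 = bbox
    keep = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx = (col + 0.5) * stride  # cell center, x
            cy = (row + 0.5) * stride  # cell center, y
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                keep.append(row * grid_size + col)
    return keep


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention, no learned projections.

    Each query (search feature) is replaced by a weighted average of the
    key/value vectors (here, the foreground template features).
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * v[d] for w, v in zip(weights, keys_values)) for d in range(dim)]
        )
    return fused


# Toy data: a 4x4 template grid over a 128-pixel patch, object box in the middle.
template_feats = [[float(i), 1.0] for i in range(16)]
keep = foreground_token_indices(grid_size=4, patch_size=128, bbox=(32, 32, 96, 96))
foreground_feats = [template_feats[i] for i in keep]

# Search features attend only to foreground template features.
search_feats = [[0.0, 1.0], [1.0, 0.0]]
fused = cross_attention(search_feats, foreground_feats)
```

In this toy setup only the four central cells of the template grid survive the mask, so the twelve background cells never take part in matching against the search features. A real tracker would apply learned query, key, and value projections and multiple heads; the sketch leaves those out to keep the masking step visible.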
