
FREQUENCY ENHANCEMENT NETWORK FOR EFFICIENT COMPRESSED VIDEO ACTION RECOGNITION

Yue Ming, Lu Xiong, Xia Jia, Qingfang Zheng, Jiangwan Zhou, Fan Feng, Nannan Hu

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
Poster 10 Oct 2023

Existing frequency-based action recognition methods achieve impressive efficiency gains. However, they ignore low-frequency texture and edge cues, which degrades accuracy. To address this problem, we propose a novel frequency enhancement (FE) block for efficient compressed video action recognition, comprising a temporal-channel two-heads attention (TCTHA) module and a frequency overlapping group convolution (FOGC) module. First, the TCTHA module uses attention to emphasize the inter-frame temporal context and the intra-frame informative frequency semantics. Then, the FOGC module groups channels in different frequency bands with overlap to extract low-frequency texture and edge cues while maintaining interaction between groups. We integrate the FE block into 2D-CNNs with frequency I-frame input; the resulting network, termed FENet, focuses on the pivotal low-frequency spatio-temporal semantics for action recognition. Experiments on HMDB-51, UCF-101, Kinetics-400, and Kinetics-700 verify that FENet achieves accuracy comparable to RGB-based methods with high efficiency.
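
The idea of grouping channels with overlap can be illustrated with a small index computation. The sketch below is not the authors' FOGC implementation, whose details the abstract does not give. It assumes, purely for illustration, that channels are ordered from low to high frequency, that they are split into equal contiguous bands, and that each band is widened by a fixed number of neighbouring channels on each side. The function name `overlapping_groups` and its parameters are hypothetical.

```python
def overlapping_groups(num_channels, num_groups, overlap):
    """Return channel-index lists for contiguous groups that share border channels.

    Assumption (not from the paper): channels are sorted by frequency, each
    group covers an equal-sized band, and each band is extended by `overlap`
    channels on both sides, clipped to the valid channel range.
    """
    if num_groups <= 0 or num_channels % num_groups != 0:
        raise ValueError("num_channels must be a positive multiple of num_groups")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    base = num_channels // num_groups  # width of each non-overlapping band
    groups = []
    for g in range(num_groups):
        start = max(0, g * base - overlap)                   # extend toward lower frequencies
        end = min(num_channels, (g + 1) * base + overlap)    # extend toward higher frequencies
        groups.append(list(range(start, end)))
    return groups


if __name__ == "__main__":
    # 12 channels, 3 bands, 1 shared channel on each side of every border.
    for band in overlapping_groups(12, 3, 1):
        print(band)
```

In this toy setting, adjacent groups share the channels around each band border, so a per-group convolution would see some of its neighbours' frequencies. With `overlap=0` the groups reduce to the disjoint partition used by an ordinary group convolution.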
