Temporal Action Proposal Generation via Deep Feature Enhancement
He-Yen Hsieh, Ding-Jie Chen, Tyng-Luh Liu
Temporal action proposal generation (TAPG) is a challenging problem in video content analysis. It aims to localize the video segments that are likely to contain actions or events. Intuitively, making satisfactory predictions for these video segments relies directly on the quality of their representation. A typical representation of a video segment is a two-stream feature comprising appearance and motion information. Rather than directly concatenating the two-stream features as previous methods do, we introduce a feature-aggregation network (FA-Net) that exploits the feature relations among neighboring video segments to obtain a high-quality representation that better characterizes the actions or events. Further, we design a feature-expansion network (FE-Net) that extracts multi-granularity features for retrieving proposals with high confidence of covering action instances. We evaluate our approach on two challenging datasets, ActivityNet-1.3 and THUMOS-14. The experiments show that the proposed approach consistently outperforms existing state-of-the-art TAPG methods.
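The abstract does not give architectural details, so the following PyTorch sketch is only one plausible reading of the two ideas it names: mixing each snippet's two-stream feature with its temporal neighbors instead of plain concatenation (FA-Net-like aggregation), and producing multi-granularity features through parallel temporal branches with different receptive fields (FE-Net-like expansion). All module names, dimensions, and operations below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class FeatureAggregation(nn.Module):
    """Sketch: fuse appearance and motion snippet features, then mix each
    snippet with its temporal neighbors rather than simply concatenating
    the two streams. Dimensions are assumptions for illustration."""

    def __init__(self, feat_dim=200, hidden_dim=256, neighborhood=3):
        super().__init__()
        # Project each stream separately before fusing them additively.
        self.rgb_proj = nn.Conv1d(feat_dim, hidden_dim, kernel_size=1)
        self.flow_proj = nn.Conv1d(feat_dim, hidden_dim, kernel_size=1)
        # Temporal convolution models relations among neighboring snippets.
        self.neighbor_mix = nn.Conv1d(
            hidden_dim, hidden_dim,
            kernel_size=neighborhood, padding=neighborhood // 2)

    def forward(self, rgb, flow):
        # rgb, flow: (batch, feat_dim, T) two-stream snippet features
        fused = torch.relu(self.rgb_proj(rgb) + self.flow_proj(flow))
        return torch.relu(self.neighbor_mix(fused))


class FeatureExpansion(nn.Module):
    """Sketch: obtain multi-granularity features via parallel dilated
    temporal convolutions, each covering a different temporal extent."""

    def __init__(self, hidden_dim=256, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=3,
                      padding=d, dilation=d)
            for d in dilations)

    def forward(self, x):
        # Each branch preserves length T but sees a different granularity.
        return [torch.relu(branch(x)) for branch in self.branches]


if __name__ == "__main__":
    rgb = torch.randn(2, 200, 100)   # (batch, feature dim, snippets)
    flow = torch.randn(2, 200, 100)
    features = FeatureExpansion()(FeatureAggregation()(rgb, flow))
    print([f.shape for f in features])  # three (2, 256, 100) tensors
```

In this sketch, the aggregation step replaces channel-wise concatenation with additive fusion plus a temporal convolution, and the expansion step supplies coarse-to-fine temporal context; how the paper actually realizes FA-Net and FE-Net may differ.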