Action Quality Assessment With Ignoring Scene Context
Takasuke Nagai, Shoichiro Takeda, Masaaki Matsumura, Shinya Shimizu, Susumu Yamamoto
SPS
Length: 00:07:44
We propose an action quality assessment (AQA) method that assesses the quality of the target action specifically while ignoring scene context, a feature unrelated to the target action. Existing AQA methods try to extract spatiotemporal features related to the target action by applying 3D convolution to the video. However, because their models are not explicitly designed to extract only the features of the target action, they also extract scene context and therefore cannot assess the target action quality correctly. To overcome this problem, we impose two losses on an existing AQA model: a scene adversarial loss and our newly proposed human-masked regression loss. The scene adversarial loss encourages the model to ignore scene context through adversarial training. The human-masked regression loss does so by making the correlation between the scores output by the AQA model and those given by human referees undefinable when the target action is not visible. Together, these two losses lead the model to assess the target action quality specifically while ignoring scene context. We evaluated our method on a diving dataset commonly used for AQA and found that it outperformed current state-of-the-art methods. This result shows that our method is effective at ignoring scene context while assessing the target action quality.
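The abstract does not give the loss formulas, so the following is only a rough sketch of the idea, not the authors' implementation. It relies on one fact: a Pearson correlation is undefined when one of the two variables has zero variance. The sketch therefore assumes the human-masked term can be approximated by penalizing the variance of scores predicted for clips in which the athlete is masked out. The entropy-based scene term is swapped in for real adversarial training purely for illustration. All function names and weights (`masked_variance_penalty`, `scene_confusion_loss`, `w_adv`, `w_masked`) are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns None when it is undefined (zero variance)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant input makes the correlation undefined
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def masked_variance_penalty(masked_preds):
    """ASSUMPTION: stand-in for the human-masked regression loss.

    Population variance of the scores predicted for human-masked clips.
    It is zero exactly when all predictions are equal, which is the case
    in which correlation with referee scores is undefined.
    """
    n = len(masked_preds)
    mean = sum(masked_preds) / n
    return sum((p - mean) ** 2 for p in masked_preds) / n


def scene_confusion_loss(scene_probs):
    """ASSUMPTION: stand-in for the scene adversarial loss.

    Negative entropy of a scene classifier's output distribution.
    It is lowest when the classifier cannot tell scenes apart (uniform).
    """
    return sum(p * math.log(p) for p in scene_probs if p > 0)


def total_loss(regression, adversarial, masked, w_adv=1.0, w_masked=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return regression + w_adv * adversarial + w_masked * masked


# Constant predictions on masked clips: correlation with referees is undefined.
referee_scores = [80.0, 90.0, 70.0]
constant_preds = [5.0, 5.0, 5.0]
print(pearson(constant_preds, referee_scores))
print(masked_variance_penalty(constant_preds))
```

Under these assumptions, a model that still produces varying scores when the athlete is hidden must be reading the scene, and the variance penalty grows accordingly.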