18 Oct 2022

Holistic understanding of videos requires recognizing the overall scene beyond detecting foreground activity and objects. It provides valuable information for various video understanding tasks such as video summarization, scene change detection, and content filtering. While significant effort has been put into developing models for scene classification in images (e.g., Places365), video-level scene recognition is relatively nascent. The scope of this paper is to address the problem of going from image representations to video representations for scene classification. In particular, we compare self-supervised deep learning methods on the video scene recognition task using the HVU dataset. Starting from strong image-level scene representations, we train a video-level scene classifier with a triplet-based contrastive loss, and we propose triplet sampling strategies that aid the self-supervision. We compare the self-supervised techniques against the image-level scene representations as well as a weakly supervised classifier trained on image labels. We observe that the models learned with the self-supervised method outperform both baselines (with statistical significance), showing that the video-level scene representations retain the representative power of a competitive image-level scene recognition model trained on Places365, while offering benefits over weakly supervised techniques.
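The sketch below illustrates the general idea of training a video-level scene embedder from frame-level image features with a triplet-based contrastive loss. It is a minimal assumption-laden example, not the paper's method: the pooling, projection head, feature dimensions, and the way anchor/positive/negative triplets are sampled are placeholders for illustration only.

import torch
import torch.nn as nn

class VideoSceneEmbedder(nn.Module):
    """Pools per-frame image-level scene features into one video-level embedding.
    Architecture and dimensions are illustrative assumptions, not the paper's model."""
    def __init__(self, frame_dim=2048, embed_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(frame_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, frame_feats):            # frame_feats: (batch, num_frames, frame_dim)
        video_feat = frame_feats.mean(dim=1)   # simple average pooling over frames
        return nn.functional.normalize(self.proj(video_feat), dim=-1)

# One triplet-loss training step. Here the anchor and positive are assumed to come
# from the same video (or same scene), the negative from a different one; the actual
# triplet sampling strategies are proposed in the paper and not reproduced here.
model = VideoSceneEmbedder()
criterion = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

anchor   = torch.randn(8, 16, 2048)  # 8 clips x 16 frames of Places365-style frame features (random stand-ins)
positive = torch.randn(8, 16, 2048)
negative = torch.randn(8, 16, 2048)

loss = criterion(model(anchor), model(positive), model(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()

The learned video-level embeddings could then feed a lightweight scene classifier, which is where the comparison against the image-level and weakly supervised baselines would take place.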
